A microsecond-level performance profiling library for gint.
Lephenixnoir e4cedf72a9 use the target's archiver to create the library 3 months ago
.gitignore build system 4 months ago
Makefile use the target's archiver to create the library 3 months ago
README.md add README 4 months ago
libprof.c basic working implementation with manual calls 4 months ago
libprof.h basic working implementation with manual calls 4 months ago

README.md

libprof: A performance profiling library for gint

libprof is a small gint library that can be used to time and profile the execution of an add-in. Using it, one can record the time spent in one or several functions to identify performance bottlenecks in an application.

libprof’s measurements are accurate down to the microsecond thanks to precise hardware timers, so it can also be used to time small portions of code.

Building

libprof is built only once for both fx-9860G and fx-CG 50, but if you use different compilers you will need to install it twice. The dependencies are:

  • A GCC cross-compiler for a SuperH architecture
  • The gint kernel

The Makefile will build the library without further instructions.

% make

By default sh3eb-elf is used to build; you can override this by setting the target variable.

% make target=sh4eb-elf

Install as usual:

% make install
# or
% make install target=sh4eb-elf

Basic use

To access the library, include the <libprof.h> header file.

#include <libprof.h>

For each function you want to time, libprof will create a counter. At the start of the program, you need to specify how many functions (libprof calls them contexts) you will be timing, so that libprof can allocate enough memory.

libprof also needs one of gint’s timers to actually measure time; it must be one of timers 0, 1 and 2, which are the only ones precise enough to do this job. You can use any of them that you are not already using for something else.

These settings are specified with the prof_init() function.

/* Initialize libprof for 13 contexts using timer 0 */
prof_init(13, 0);

You can then measure the execution time of a function by calling prof_enter() at the beginning and prof_leave() at the end. You just need to “name” the function by giving its context ID, which is any number from 0 up to, but not including, the number of contexts passed to prof_init() (here 0 to 12).

void function5(void)
{
	prof_enter(5);
	/* Do stuff... */
	prof_leave(5);
}

This will add function5()’s execution time to counter number 5, so if the function is called several times the total execution time will be recorded. This way, at the end of the program, you can look at the counters to see where most of the time has been spent.

To retrieve the total execution time of a function, use prof_time():

uint32_t total_function5_us = prof_time(5);

This time is measured in microseconds, even though the timers are actually more precise than this. Note that the overhead of prof_enter() and prof_leave() is usually less than 1 microsecond, so the time is very close to the actual time spent in the function even if the context is frequently entered and left.

At the end of the program, free the resources of the library by calling prof_quit().

prof_quit();
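
As a minimal sketch of how the counters could be compared once the program has run, the helper below picks the context with the largest total. hottest_context() is a hypothetical name, not part of libprof. It works on a plain array so that it stays independent of the library; on the calculator you would presumably fill times_us[i] with prof_time(i) before calling prof_quit().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libprof: return the index of the largest
   entry of times_us[], that is, the context where the most time was spent.
   If several contexts are tied, the lowest index is returned. */
int hottest_context(const uint32_t *times_us, int count)
{
	assert(times_us != NULL && count > 0);

	int best = 0;
	for(int i = 1; i < count; i++)
	{
		if(times_us[i] > times_us[best]) best = i;
	}
	return best;
}
```

With 13 contexts, one would call hottest_context(times, 13) on an array of 13 totals and then inspect the returned context first.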

Managing context numbers

The number of contexts is fixed for the whole execution, and all context IDs must be between 0 (included) and this number (excluded). Managing the numbers by hand is error-prone and can lead to memory errors.

A simple way of managing context numbers without risking an error is to use an enumeration.

enum {
	/* Whatever function you need */
	PROFCTX_FUNCTION1 = 0,
	PROFCTX_FUNCTION2,
	PROFCTX_FUNCTION3,

	PROFCTX_COUNT,
};

Enumerations will assign a value to all the provided names, and increment by one each time. So for example here PROFCTX_FUNCTION2 is equal to 1 and PROFCTX_COUNT is equal to 3. As you can see this is conveniently equal to the number of contexts, which makes it simple to initialize the library:

prof_init(PROFCTX_COUNT, 0);

Then you can use context names instead of numbers:

prof_enter(PROFCTX_FUNCTION1);
/* Do stuff... */
prof_leave(PROFCTX_FUNCTION1);

If you want to use a new context, you just need to add a name in the enumeration (anywhere but after PROFCTX_COUNT) and all IDs plus the initialization call will be updated automatically.
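
The same enumeration can also index other per-context data. The sketch below attaches a readable label to each context, which might be handy when displaying results. The profctx_names table and the profctx_name() function are hypothetical and not provided by libprof, and the static_assert macro assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

enum {
	PROFCTX_FUNCTION1 = 0,
	PROFCTX_FUNCTION2,
	PROFCTX_FUNCTION3,

	PROFCTX_COUNT,
};

/* Hypothetical label table indexed by context ID (not part of libprof) */
static const char *const profctx_names[] = {
	[PROFCTX_FUNCTION1] = "function1",
	[PROFCTX_FUNCTION2] = "function2",
	[PROFCTX_FUNCTION3] = "function3",
};

/* Catches a context appended at the end of the enumeration without a label.
   A label missing in the middle leaves a NULL entry, handled below. */
static_assert(sizeof profctx_names / sizeof profctx_names[0] == PROFCTX_COUNT,
	"profctx_names must cover every context");

/* Return the label of a context, or "?" if the ID is out of range or has
   no label */
const char *profctx_name(int ctx)
{
	if(ctx < 0 || ctx >= PROFCTX_COUNT) return "?";
	return profctx_names[ctx] ? profctx_names[ctx] : "?";
}
```

Designated initializers keep each label attached to its name even if the enumeration is reordered.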

Timing a single execution

prof_enter() and prof_leave() will add the measured execution time to the context counter. Sometimes you want to make individual measurements instead of adding all calls together. To achieve this, clear the counter before the measurement using prof_clear().

Here is an example of a function exec_time_us() that times the execution of a function f passed as parameter.

uint32_t exec_time_us(void (*f)(void))
{
	int ctx = PROFCTX_EXEC_TIME_US;
	prof_clear(ctx);
	prof_enter(ctx);

	f();

	prof_leave(ctx);
	return prof_time(ctx);
}

Exploiting the measurement’s precision

The overhead of prof_enter() and prof_leave() is usually less than a microsecond, but the setup cost of your benchmark might still matter (loading data from memory to initialize arrays, performing function calls…). In this case, the best you can do is measure the time difference between two similar calls, so that the cost they share cancels out.
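
One way to apply this, sketched here under the assumption that the benchmark is a loop whose iteration count can be chosen, is to time it once with n iterations and once with 2n. The fixed startup cost appears in both measurements and disappears in the subtraction. per_iteration_us() is a hypothetical helper; its two time arguments would come from measurements such as those returned by exec_time_us().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: estimate the cost of one iteration from two
   measurements of the same benchmark, one with n iterations and one with
   2n iterations. The result is rounded down to a whole microsecond. */
uint32_t per_iteration_us(uint32_t time_n_us, uint32_t time_2n_us, uint32_t n)
{
	assert(n > 0 && time_2n_us >= time_n_us);

	/* (setup + 2n*c) - (setup + n*c) = n*c */
	return (time_2n_us - time_n_us) / n;
}
```

For instance, with a 100 us setup and 3 us per iteration, 1000 iterations take 3100 us and 2000 iterations take 6100 us, so per_iteration_us(3100, 6100, 1000) gives 3.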

If you need something even more precise then you can access libprof’s counter array directly to get the timer-tick value itself:

uint32_t elapsed_timer_tick = prof_elapsed[ctx];

The frequency of this tick is PΦ/4, where the value of PΦ can be obtained by querying gint’s clock module:

#include <gint/clock.h>
uint32_t tick_freq = clock_freq()->Pphi_f / 4;
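
Converting a tick count into a duration is then simple arithmetic. The sketch below takes the PΦ frequency as a plain parameter so it does not depend on gint; ticks_to_ns() is a hypothetical helper, and on the calculator the second argument would be the Pphi_f value shown above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a number of timer ticks to nanoseconds,
   given the Pphi frequency in Hz. The tick frequency is Pphi/4. The result
   is rounded down. */
uint64_t ticks_to_ns(uint32_t ticks, uint32_t pphi_hz)
{
	uint32_t tick_freq = pphi_hz / 4;
	assert(tick_freq > 0);

	/* Use a 64-bit intermediate: ticks * 10^9 does not fit in 32 bits */
	return (uint64_t)ticks * 1000000000u / tick_freq;
}
```

Taking 29491200 Hz as an arbitrary example value for PΦ, the tick frequency is 7372800 Hz, so one tick lasts about 135 ns and ticks_to_ns(7372800, 29491200) is exactly one second.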

One noteworthy phenomenon is the startup cost. The first few measurements are always less precise, probably due to cache effects. I frequently see a first measurement with an additional 100 us of execution time and 3 us of overhead, both of which disappear in subsequent tests. It is therefore expected that the first few data points lie outside the range of the following ones.
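
A simple way to account for this, given here only as a sketch, is to repeat the measurement several times and ignore the first few results when averaging. steady_mean_us() is a hypothetical helper, and the number of samples to discard is an assumption to adjust by looking at your own data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: average the samples after skipping the first
   `warmup` entries, which are assumed to be skewed by startup effects.
   The result is rounded down to a whole microsecond. */
uint32_t steady_mean_us(const uint32_t *samples_us, size_t count, size_t warmup)
{
	assert(samples_us != NULL && warmup < count);

	uint64_t sum = 0;
	for(size_t i = warmup; i < count; i++) sum += samples_us[i];

	return (uint32_t)(sum / (count - warmup));
}
```

Each sample could be obtained with a function like exec_time_us(). With samples of 205, 108, 103, 102, 104 and 103 us, discarding the first two yields a mean of 103 us.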