
I have written a kernel module to measure the accuracy of the ndelay() kernel function.

#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/ktime.h>
#include <linux/delay.h>

static int __init initialize(void)
{
    ktime_t start, end;
    s64 actual_time;
    int i;

    for (i = 0; i < 10; i++) {
        /* Time one ndelay(100) call; the result also includes ktime_get() overhead. */
        start = ktime_get();
        ndelay(100);
        end = ktime_get();
        actual_time = ktime_to_ns(ktime_sub(end, start));
        printk(KERN_INFO "%lld\n", (long long)actual_time);
    }
    return 0;
}

static void __exit final(void)
{
    printk(KERN_INFO "Unload module\n");
}

module_init(initialize);
module_exit(final);

MODULE_AUTHOR("Bhaskar");
MODULE_DESCRIPTION("delay of 100ns");
MODULE_LICENSE("GPL");

The dmesg output looks like this:

[16603.805783] 514
[16603.805787] 350
[16603.805789] 373
[16603.805791] 323
[16603.805793] 362
[16603.805794] 320
[16603.805796] 331
[16603.805797] 312
[16603.805799] 304
[16603.805801] 350
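
Some part of each reading above may be the cost of the two clock reads rather than of ndelay() itself. The following is a minimal user-space sketch of how that floor could be estimated, assuming clock_gettime(CLOCK_MONOTONIC) as a stand-in for ktime_get(); the helper names now_ns and min_read_overhead_ns are made up for illustration. In the module, the analogous check would be two consecutive ktime_get() calls with nothing between them.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read CLOCK_MONOTONIC in nanoseconds (user-space stand-in for ktime_get()). */
static int64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Smallest gap seen between two back-to-back clock reads over `rounds` tries.
 * Returns INT64_MAX if rounds is not positive. */
static int64_t min_read_overhead_ns(int rounds)
{
    int64_t best = INT64_MAX;

    for (int i = 0; i < rounds; i++) {
        int64_t start = now_ns();
        int64_t end = now_ns();   /* nothing is being timed between the reads */

        if (end - start < best)
            best = end - start;
    }
    return best;
}
```

Subtracting this floor from the measured values gives a rough idea of how much of the overshoot belongs to the delay call and how much to the measurement.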

I have read a related Stack Overflow post: "Why udelay and ndelay is not accurate in linux kernel?"

But I want a finely tuned nanosecond delay (probably in the range of 100-250 ns) in kernel space. Can anyone suggest an alternative for achieving this?


You can use high-resolution timers (hrtimers) through the following functions:

    hrtimer_init
    hrtimer_start
    hrtimer_cancel

An example is available here


If you are targeting x86-only systems, you can use the rdtsc instruction to read the CPU's time-stamp counter. Reading it has very little overhead, but you do need to convert the cycle count to nanoseconds, and that conversion depends on how fast the counter is running.

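/* Read the x86 time-stamp counter (TSC); returns a tick count, not a time. */
static unsigned long long rdtsc(void)
{
    unsigned int low, high;

    /* RDTSC places the low 32 bits in EAX and the high 32 bits in EDX. */
    asm volatile("rdtsc" : "=a" (low), "=d" (high));
    return low | ((unsigned long long)high << 32);
}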

Otherwise, you can use the kernel's high-resolution timers API.
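
For the rdtsc route, the conversion mentioned above could be sketched as follows. This is only an illustration: cycles_to_ns is a made-up helper, and the counter frequency in kHz is passed in as a parameter. On x86 Linux I believe the kernel keeps its calibrated value in the tsc_khz variable, but treat that as an assumption to verify on your kernel version.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a TSC tick delta to nanoseconds.
 *
 *   ns = cycles / (tsc_khz * 1000 ticks per second) * 1e9
 *      = cycles * 1000000 / tsc_khz
 *
 * The multiplication can overflow for very long intervals (above roughly
 * 1.8e13 ticks), so this is meant for short measurements only. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t tsc_khz)
{
    assert(tsc_khz > 0);
    return (cycles * 1000000ULL) / tsc_khz;
}
```

For example, with an assumed 2.4 GHz counter (tsc_khz = 2400000), a delta of 240 ticks corresponds to 100 ns. On CPUs with a constant TSC the counter should tick at one fixed rate regardless of the number of cores or the current core frequency, so a single conversion factor is normally enough.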

  • Does the above code snippet return the number of cycles or the elapsed time itself? – Bhaskar Jupudi Mar 31 '16 at 22:10
  • rdtsc returns clock cycles only. It is fast and has low overhead, but you do need to convert the result based on how fast your CPU is, and it is x86 only. – Jbobo Lee Apr 1 '16 at 0:35
  • Thanks for your comment. But I'm quite confused here. My CPU has 12 cores. Can you provide an example of converting the number of cycles obtained from rdtsc to actual time in nanoseconds? – Bhaskar Jupudi Apr 1 '16 at 1:18
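
To come back to the original goal of a 100-250 ns delay: one possible approach is to spin on the cycle counter until enough ticks have passed. The sketch below is hedged and not a tested kernel implementation. The names counter_fn, ns_to_cycles, spin_delay_ns and fake_counter are all hypothetical, and the tick source is passed in as a function pointer so the loop can be tried outside the kernel. In a module the source would be an rdtsc read, the loop would normally call cpu_relax(), and preemption or interrupts could still stretch the delay.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hook: any monotonically increasing tick source, e.g. a TSC read. */
typedef uint64_t (*counter_fn)(void);

/* Convert nanoseconds to ticks, rounding up so the wait is never too short. */
static uint64_t ns_to_cycles(uint64_t ns, uint64_t tsc_khz)
{
    assert(tsc_khz > 0);
    return (ns * tsc_khz + 999999ULL) / 1000000ULL;
}

/* Spin until at least `ns` nanoseconds' worth of ticks have passed.
 * Returns the number of ticks actually spent. */
static uint64_t spin_delay_ns(counter_fn read_counter, uint64_t ns,
                              uint64_t tsc_khz)
{
    const uint64_t target = ns_to_cycles(ns, tsc_khz);
    const uint64_t start = read_counter();
    uint64_t now;

    do {
        now = read_counter();   /* a kernel build would also cpu_relax() here */
    } while (now - start < target);

    return now - start;
}

/* Stand-in tick source for trying the loop off-target:
 * it advances by 100 ticks on every read. */
static uint64_t fake_counter(void)
{
    static uint64_t ticks;

    ticks += 100;
    return ticks;
}
```

With the stand-in source and an assumed 2.4 GHz counter, a 100 ns request needs 240 ticks and the loop exits after 300, which shows that the granularity of each counter read bounds how close the delay can get to the target.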

