Home | Papers | Software | Blog | Tips | CV Contact

This page contains a list of general tips on several different subjects. The list is continuously under construction.




Performance Counter

You often have to evaluate the performance of an application and need higher-resolution measurements than those provided by gettimeofday. Luckily, Pentium processors provide a time stamp counter, read through the RDTSC (ReaD Time Stamp Counter) instruction, which is incremented at every single clock tick. The following code shows how you can access this counter (it assumes gcc as the compiler).


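     typedef unsigned long long int rdtsc_t;

     static inline rdtsc_t
     rdtsc(void)
     {
        unsigned int lo, hi;

        /* RDTSC leaves the counter in EDX:EAX. Reading the two halves
           separately works on both 32-bit and 64-bit x86, whereas the
           "=A" constraint is only correct on 32-bit. */
        __asm__ __volatile__ ("rdtsc"
                              : "=a" (lo), "=d" (hi) // Output
                             );
        return ((rdtsc_t)hi << 32) | lo;
     }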
   
Notice that this returns the number of processor clock ticks since your machine was booted. Thus, if you want to get a time you need to divide it by the clock frequency of your machine (that is, multiply it by the inverse of the frequency). Beware that newer Linux kernels take advantage of frequency scaling, so whenever you do performance evaluation make sure that you disable the cpuspeed daemon, for instance by executing the following command:

       # service cpuspeed stop
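
The tick-to-time conversion described above can be sketched as follows. This is only a minimal illustration: the helper names ticks_elapsed and ticks_to_seconds are hypothetical, and cpu_hz is assumed to be a constant clock frequency that you supply yourself (for instance taken from the "cpu MHz" line of /proc/cpuinfo, multiplied by one million).

```c
#include <stdint.h>

/* Hypothetical helper: ticks elapsed between two counter readings.
   Unsigned subtraction stays correct even if the counter wrapped once. */
static inline uint64_t
ticks_elapsed(uint64_t start, uint64_t end)
{
    return end - start;
}

/* Hypothetical helper: convert a tick count to seconds, assuming the
   clock frequency cpu_hz (in Hz) stayed constant during the measurement. */
static inline double
ticks_to_seconds(uint64_t ticks, double cpu_hz)
{
    return (double)ticks / cpu_hz;
}
```

Typical use would be to read the counter before and after the code under test, pass both readings to ticks_elapsed, and convert the result with ticks_to_seconds. The conversion is only meaningful if frequency scaling is disabled, as noted above.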
   


Efficient Cell Synchronization

Cell provides mailboxes as a convenient way to synchronize the SPEs and the PPE. However, programmers who care about performance should take one thing into account. When an SPE uses a mailbox to communicate with the PPE, the latter performs a DMA transaction for each attempted read of the mailbox. The end result is that a lot of bus traffic is generated, which may adversely impact performance. Another approach that, while less user friendly, ensures maximum performance is to rely on spinlocks and DMA transfers. The basic idea is to have the PPE and the SPE coordinate by writing to agreed areas of main memory and of the local store, respectively. The code below shows how you might achieve this. The sample code also lets you measure the performance of this spinlock synchronization, which you can easily compare with mailbox-based synchronization.


 
---------- PPE Code ------------
#include <stdio.h>
#include <stdlib.h>

#include <libspe.h>

// -- Common Include --
#include "cbench/common.h"

extern spe_program_handle_t cbench_spinlock_spu;

volatile unsigned long long spinlock     __attribute__((aligned(128)));
unsigned long long          spinlock_spu __attribute__((aligned(128)));

int main(int argc, char* argv[])
{
  speid_t speid;
  int     status;
  int     tagid = 1;

  unsigned int run;

  char TEST_ID[32] ="SPINLOCK:PPU>";
 
 
  if (argc < 2) {
      printf("USAGE:\n\tspinlock_spu \n");
      return 1;
  }

  run = atoi(argv[1]);
  if (run == 0) {
      run = 1;
  }


  printf("%s PPU Spinlock at [%p]\n", TEST_ID, (void *)&spinlock);
  spinlock = 0;

  speid = spe_create_thread( 0,
                 &cbench_spinlock_spu,
                 (unsigned long long*)&spinlock,
                 (unsigned long long*)run,
                 -1,
                 0 );
 
  if(speid == 0){
    perror( "Unable to create SPE thread\n");
    return -1;
  }
  unsigned int i;

  for (i = 0; i < run; ++i) {
      while (spinlock == 0) { }
      spinlock_spu = spinlock;
      spinlock = 0;

      /* Now the spinlock contains the LS address for the SPU spinlock */
      spe_mfc_get(speid,
          spinlock_spu,
          (void*)&spinlock_spu,
          sizeof(spinlock_spu),
          tagid,
          0,
          0);
  }

  spe_wait(speid, &status, 0);
  printf("%s DONE!\n", TEST_ID);
  return 0;
}


--------- SPE Code --------------
#include 
#include 
#include 
#include 
#include 
#include 

volatile unsigned long long spinlock     __attribute__ ((aligned (128)));


/*
 * This has to be an unsigned long long, just because we need to write
 * this much on the destination spinlock.
 */
unsigned long long spinlock_spu_ls __attribute__ ((aligned (128)));
unsigned int       spinlock_ppu_ea __attribute__ ((aligned (128)));

int main(unsigned long long spuid,
     addr64 argp,
     addr64 envp)
{
  int tag_id = 0;

  char TEST_ID[32] ="SPINLOCK:SPU>";

  spinlock_ppu_ea = argp.ui[1];
  spinlock_spu_ls = (unsigned int)&spinlock;

  spinlock = 0;

  unsigned int run = envp.ui[1];


  sim_printf("=================================================\n");
  sim_printf("%s TEST STARTED\n", TEST_ID);
  sim_printf("%s Performing %u Spinlock measurements\n", TEST_ID, run);
  sim_printf("%s PPU Spinlock at  [0x%x]\n", TEST_ID, spinlock_ppu_ea);
  sim_printf("%s SPU Spinlock at  [0x%llx]\n", TEST_ID, spinlock_spu_ls);
 
  unsigned int i;
  for (i = 0; i < run; ++i) {
      prof_cp0();
      prof_cp30();
     
      /*
       * Write the LS address of the SPU spinlock into the PPU spinlock
       * (located at spinlock_ppu_ea in main memory).
       */
      spu_mfcdma32(&spinlock_spu_ls,
           (unsigned int)spinlock_ppu_ea,
           sizeof(spinlock_spu_ls),
           tag_id,
           MFC_PUT_CMD);
     
      mfc_write_tag_mask(1 << tag_id);
      mfc_read_tag_status_all();
     
      /*
       * Wait for the DMA issued by the PPU (spe_mfc_get) to set the spinlock.
       */
      while (spinlock == 0) { }

      prof_cp31();
      sim_printf("======================[%d]=========================\n", i);
      spinlock = 0;
  }
  sim_printf("%s TEST COMPLETED\n", TEST_ID);

  sim_printf("=================================================\n");
  return 0;
}
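
If you do not have Cell hardware at hand, the same handshake can be sketched portably. The code below is only an analogy under stated assumptions, not Cell code: two POSIX threads stand in for the PPE and the SPE, and C11 atomic stores stand in for the DMA transfers. All names (flag_to_main, flag_to_worker, worker, run_handshake) are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Two flags, one per side, mirroring the two spinlock words above. */
static atomic_uint flag_to_main;   /* written by worker, polled by main */
static atomic_uint flag_to_worker; /* written by main, polled by worker */

/* Plays the role of the SPE: signal, then spin until answered. */
static void *
worker(void *arg)
{
    unsigned int rounds = *(unsigned int *)arg;
    unsigned int i;

    for (i = 0; i < rounds; ++i) {
        /* Signal the other side (the DMA put in the SPE code). */
        atomic_store_explicit(&flag_to_main, 1, memory_order_release);

        /* Spin until the other side answers, then re-arm the flag. */
        while (atomic_load_explicit(&flag_to_worker,
                                    memory_order_acquire) == 0) { }
        atomic_store_explicit(&flag_to_worker, 0, memory_order_relaxed);
    }
    return NULL;
}

/* Plays the role of the PPE. Returns the number of completed
   handshakes, or 0 if the worker thread could not be created. */
static unsigned int
run_handshake(unsigned int rounds)
{
    pthread_t tid;
    unsigned int i, done = 0;

    atomic_store(&flag_to_main, 0);
    atomic_store(&flag_to_worker, 0);
    if (pthread_create(&tid, NULL, worker, &rounds) != 0)
        return 0;

    for (i = 0; i < rounds; ++i) {
        /* Spin until the worker signals, re-arm, then answer. */
        while (atomic_load_explicit(&flag_to_main,
                                    memory_order_acquire) == 0) { }
        atomic_store_explicit(&flag_to_main, 0, memory_order_relaxed);
        atomic_store_explicit(&flag_to_worker, 1, memory_order_release);
        ++done;
    }
    pthread_join(tid, NULL);
    return done;
}
```

Each side re-arms its own flag before answering, so a signal cannot be lost between rounds. With gcc this would typically be built with -std=c11 -pthread.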
  


Setting up a Webcam on Linux

For most webcams, all it takes is to install the spca5xx driver. Once you have installed it, you should be able to check the video stream with either amsn or ekiga (formerly gnomemeeting). BTW, if you have not tried it yet, with amsn you can easily set up video conferences with friends running MS msn.


Ekiga and Messagenet.it

Messagenet provides free Phone-to-PC VoIP communication, and on Linux you can easily use Ekiga as a VoIP terminal. Configuring it to use the Messagenet network is rather straightforward.


403 Forbidden & Fedora Core 5

If you have SELinux enabled with the default configuration, beware that Apache won't be able to read the public_html directory unless you either disable SELinux for HTTPD or tweak its configuration. As I use Apache on my machine only for testing my website before deploying it, I typically leave SELinux disabled. To do so, click on [System|Administration|Security Level and Firewall], and then on the SELinux tab do as illustrated below.

