Wednesday, June 29, 2011

Adding visual audio level monitors

With the ability to overlay images on the video stream tackled, the next task was to analyse the audio and overlay VU meters, so that we can see how noisy the little tyke is without actually having to listen to him (it is a him, by the way; he popped out on the 22nd of June).

The camera uses the Linux Open Sound System (OSS), which was superseded by ALSA from the 2.5 kernel series onwards and isn't particularly nice to work with. The camera streaming server (camserv), for which there is no source code, opens the audio-in device on startup. As the device can only be opened once (and camserv refuses to work without it), I had to look at other avenues to get the audio.

Fortunately, camserv offers the audio in a mangled stream which can be found at http://[cam-ip]/cgi/audio/audio.cgi - this stream provides chunks of 16-bit PCM audio at what appears to be about 8 kHz (although the sample rate is only a guess). The stream isn't presented in a friendly format (i.e. browsers don't recognise it and simply try to download it as a series of files), but it is quite easy to write a bit of code to connect to the port and analyse the audio.

I put together some code to connect to the stream and display the current, peak and average volume overlaid on the video stream. It uses some of the same code used for the text display in the previous entry and seems to run around 1/4 sec behind the actual audio (this is due to the audio being streamed locally, analysed and then overlaid onto the video).
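As a rough illustration of the level calculation (not the code from the download), here is a minimal Python sketch. It assumes the samples are signed, little-endian and mono, which the stream description above does not confirm, and it ignores whatever framing the mangled stream wraps around each chunk. The names chunk_levels and LevelMeter are made up for this example.

```python
import math
import struct


def chunk_levels(chunk):
    """Return (peak, rms) for one chunk of 16-bit PCM.

    Assumption: samples are signed little-endian; the camera may differ.
    """
    count = len(chunk) // 2              # ignore a trailing odd byte
    if count == 0:
        return 0, 0.0
    samples = struct.unpack("<%dh" % count, chunk[:count * 2])
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / count)
    return peak, rms


class LevelMeter:
    """Tracks current, peak-hold and running-average levels across chunks."""

    def __init__(self):
        self.current = 0.0   # RMS of the most recent chunk
        self.peak = 0        # largest absolute sample seen so far
        self.average = 0.0   # mean of all chunk RMS values so far
        self._chunks = 0

    def update(self, chunk):
        peak, rms = chunk_levels(chunk)
        self.current = rms
        self.peak = max(self.peak, peak)
        self._chunks += 1
        # Incremental mean, so no history of chunks needs to be kept.
        self.average += (rms - self.average) / self._chunks
```

Feeding each received chunk to update() gives the three values that a VU meter overlay would need to draw.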

I have included the source code and a compiled version here, and below is a short video of it being streamed to an iPad:




And here it is on Apple TV2 via XBMC:



To use the code on your camera it needs to authenticate to the local camera web server. It defaults to username admin and password admin, as do the cameras out of the box. If you are using your own username and password (and I strongly suggest you do), you will need to specify the base64 encoding of the username and password on the command line via -a. There are a number of websites which can do this encoding, so I haven't included it in the code. The value needs to be the base64 encoding (not a hash) of username:password, e.g. 'admin:admin', which encodes to 'YWRtaW46YWRtaW4='.
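If you would rather not paste your password into a website, the value is easy to produce locally. A small Python sketch (the function name basic_auth_token is just for illustration):

```python
import base64


def basic_auth_token(username, password):
    """Base64-encode 'username:password' for use with the -a option."""
    raw = f"{username}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


print(basic_auth_token("admin", "admin"))  # YWRtaW46YWRtaW4=
```

Bear in mind that base64 is trivially reversible, so the value on the command line should be treated as if it were the password itself.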


Wednesday, June 8, 2011

Overlaying text and images into the video stream

Having tinkered with the toolchain for a while, I have turned my attention to one of the core tasks of the project: overlaying text and images in the video stream.

The video capture, encoding and motion detection are all handled by the plmedia.o module; fortunately the source code for this module is provided in the toolchain, although there is no documentation.

After a fair bit of head scratching, it became apparent that the video capture hardware writes directly to memory on demand (when requested by the module), and a further request is then sent to the encoder hardware, which encodes/compresses the frame directly into another buffer. This is the buffer used by the device when streaming video. There is further hardware which performs motion detection using the video capture buffer.

I decided to inject into the video stream directly before the encoding takes place in order to be sure (with as much certainty as possible) that the frame has been captured in full.

The plmedia.o source is made up of 3 core components: plgrabber, plencoder and plmd. plencoder is the component responsible for managing the encoding process, and this is where I have added my code.

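The video is captured in YCbCr format with a 4:1:1 sample ratio between the luminance and each chroma channel - that is, the Y channel is at 640x480 resolution with 8 bpp and the Cb & Cr channels are each at 320x240 resolution with 8 bpp. Because the chroma is halved in both dimensions, this is the layout more commonly labelled 4:2:0.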
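To make the arithmetic concrete, the following Python sketch works out the size of each plane from the figures above. It assumes the planes are tightly packed with no row padding, which I have not verified against the module source; plane_sizes is a name invented for this example.

```python
WIDTH, HEIGHT = 640, 480  # Y plane resolution quoted above


def plane_sizes(width, height):
    """Bytes per plane at 8 bits per sample, chroma halved in both axes."""
    luma = width * height
    chroma = (width // 2) * (height // 2)
    return {"Y": luma, "Cb": chroma, "Cr": chroma}


sizes = plane_sizes(WIDTH, HEIGHT)
total = sum(sizes.values())
print(sizes)   # {'Y': 307200, 'Cb': 76800, 'Cr': 76800}
print(total)   # 460800
```

Each chroma plane is a quarter of the size of the Y plane, so a full frame averages 12 bits per pixel.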

Each channel is captured simultaneously in its own buffer. I added an IOCTL command to pass a structure which contains the dimensions of the image to be added, a TTL (so images can expire) and pointers to buffers which contain the appropriate images. The definitions for the IOCTL command and the buffer can be found in the source below in include/video/plmedia.h.

The heavy lifting is done in the plencoder.c source; look for a function called OverlayImages, which iterates through the image buffers before each encode. It is interesting to note that I had huge issues with image corruption which took some real head scratching to solve: the overlaid images would appear corrupt, with some elements of the video image showing through. It turns out that the processor has a little cache which needs to be flushed, otherwise the overlay data isn't written to the buffer before the encoder kicks in.

In EncIOCTL you will find the new function to add an image to the buffer list - there is functionality to replace an existing buffer or add a new one - the replace function is useful for adding animation or moving images to the buffer (such as a time stamp) without needlessly wasting resources.

Finally, in InitEncoder and CleanupEncode there is additional code to create and release the buffers.

There is an arbitrary limit of up to 6 independent overlays, which can be increased by modifying the IMAGEBUFFERS constant in plencoder.h.
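The bookkeeping described above (a fixed number of slots, add or replace, and a TTL so overlays expire) can be modelled in a few lines of Python. This is only a sketch of the idea, not a translation of the module: the OverlayTable class, addressing slots by index and counting the TTL in encoded frames are all assumptions made for the example.

```python
IMAGEBUFFERS = 6  # overlay limit quoted above


class OverlayTable:
    """Toy model of a fixed-size overlay list with per-overlay TTL."""

    def __init__(self, limit=IMAGEBUFFERS):
        self.slots = [None] * limit

    def add(self, overlay):
        """Store overlay in the first free slot; return its index, or -1 if full."""
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = dict(overlay)
                return i
        return -1

    def replace(self, index, overlay):
        """Overwrite an existing slot, e.g. to update a time stamp."""
        self.slots[index] = dict(overlay)

    def tick(self):
        """Assumed to run once per encoded frame.

        Frees expired overlays, ages the rest, and returns the slot
        indexes that should be drawn on this frame.
        """
        live = []
        for i, slot in enumerate(self.slots):
            if slot is None:
                continue
            if slot["ttl"] <= 0:
                self.slots[i] = None      # expired: free the slot
                continue
            slot["ttl"] -= 1
            live.append(i)
        return live
```

Replacing a slot in place rather than adding a new one is what keeps an animated overlay from eating into the slot limit.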

The image buffers must be passed in the right format, and this is where it can get tricky: images need to be converted to YCbCr format and at the right resolution for each channel. The co-ordinates and dimensions refer to the Y channel and are appropriately converted for the Cb and Cr channels.
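For anyone preparing their own images, the conversion looks roughly like the Python sketch below. It uses the full-range BT.601 (JFIF) coefficients and averages each 2x2 block for the chroma planes; which matrix and which downsampling method the camera actually expects is an assumption on my part, and the function names are invented for the example.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 0..255 range."""
    return max(0, min(255, int(round(value))))


def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) conversion; the camera's matrix is assumed."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)


def rgb_image_to_planes(pixels):
    """Convert rows of (r, g, b) tuples into separate Y, Cb and Cr planes.

    Width and height must be even. The Y plane keeps full resolution;
    each chroma sample is the average of a 2x2 block of pixels.
    """
    height, width = len(pixels), len(pixels[0])
    converted = [[rgb_to_ycbcr(*p) for p in row] for row in pixels]
    y_plane = [[c[0] for c in row] for row in converted]
    cb_plane, cr_plane = [], []
    for top in range(0, height, 2):
        cb_row, cr_row = [], []
        for left in range(0, width, 2):
            block = [converted[top + dy][left + dx]
                     for dy in (0, 1) for dx in (0, 1)]
            cb_row.append(clamp8(sum(c[1] for c in block) / 4))
            cr_row.append(clamp8(sum(c[2] for c in block) / 4))
        cb_plane.append(cb_row)
        cr_plane.append(cr_row)
    return y_plane, cb_plane, cr_plane


def chroma_rect(x, y, width, height):
    """Map a rectangle given in Y-plane pixels onto the half-size chroma planes."""
    return x // 2, y // 2, width // 2, height // 2
```

Keeping overlay positions and sizes even avoids rounding surprises when the rectangle is halved for the chroma planes.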

I have also created an applet to add text to the video stream as part of the babycam project; the code is attached below. It uses a pre-created BMP consisting of a matrix of characters, white on black. The code extracts the relevant characters, builds a buffer, converts it and then sends it to the module. The same code can be used to send a BMP, although it needs to be in 24bpp uncompressed format (regular Windows .bmp files are fine).
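The character extraction step can be sketched as follows in Python. The layout of the character sheet here (a grid with a fixed glyph size, a known number of columns, and characters in ASCII order from a given first character) is hypothetical and chosen for illustration; the real BMP in the download may be arranged differently, and glyph_origin and render_text are not names from the applet.

```python
def glyph_origin(ch, glyph_w, glyph_h, columns, first_char=" "):
    """Top-left pixel of a character's cell in the (hypothetical) sheet grid."""
    index = ord(ch) - ord(first_char)
    if index < 0:
        raise ValueError("character %r is not in the sheet" % ch)
    return (index % columns) * glyph_w, (index // columns) * glyph_h


def render_text(sheet, text, glyph_w, glyph_h, columns, first_char=" "):
    """Build a strip of pixels for text by copying glyph cells side by side.

    sheet is a list of pixel rows; the result has glyph_h rows, each
    len(text) * glyph_w pixels wide.
    """
    rows = [[] for _ in range(glyph_h)]
    for ch in text:
        x0, y0 = glyph_origin(ch, glyph_w, glyph_h, columns, first_char)
        for dy in range(glyph_h):
            rows[dy].extend(sheet[y0 + dy][x0:x0 + glyph_w])
    return rows
```

The resulting strip is an ordinary image, so it can go through the same YCbCr conversion as any other overlay before being sent to the module.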

The sample below shows a screen grab of the module working with 2 overlays added by the displaytext applet - although they are in black and white, there is full colour support.


And here is the same scene being streamed to an iPad:




The source and cross compiled code for the module can be downloaded here

The source and cross compiled code for the text overlay applet can be downloaded here