BeamformIt
(The fast and robust acoustic beamformer)
BeamformIt is an acoustic beamforming tool that accepts a variable number
of input channels and computes an output using a filter-and-sum
beamforming technique.
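As a rough illustration of the filter-and-sum idea, the sketch below aligns each channel by an integer sample delay and then sums the channels with per-channel weights. It is a simplified weighted delay-and-sum written for this page, not BeamformIt's actual implementation: the function name `delay_and_sum`, the fixed integer delays, and the fixed weights are all assumptions made for the example. A real system would have to estimate the delays and weights from the signals.

```python
def delay_and_sum(channels, delays, weights):
    """Weighted delay-and-sum over equal-length channels.

    channels: list of lists of float samples, one list per microphone
    delays:   per-channel integer delay in samples (positive means the
              channel lags the reference and must be advanced)
    weights:  per-channel weights, expected to sum to 1
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, delay, weight in zip(channels, delays, weights):
        for t in range(n):
            src = t + delay  # read ahead to undo the channel's lag
            if 0 <= src < n:  # samples shifted past the edges count as zero
                out[t] += weight * samples[src]
    return out


# Hypothetical two-microphone input: mic2 hears the same pulse 2 samples later.
mic1 = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
aligned = delay_and_sum([mic1, mic2], delays=[0, 2], weights=[0.5, 0.5])
print(aligned)  # [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```

With the correct delay, the two half-weighted pulses add up to the original amplitude. With delays of zero they would stay apart as two half-amplitude pulses.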
BeamformIt was originally implemented by Xavier Anguera at ICSI for
participation in the NIST RT05s meetings evaluation, to deal with the
varying number of microphone channels available in a meeting room.
It was then rewritten and improved for the RT06s evaluation, and
finally readjusted and documented for public release.
BeamformIt was initially focused on processing the data used in
the RT evaluations, but it can now process all sorts of data, producing
a single homogeneous output (SPH format, 16 kHz, 16 bits/sample).
If you use the software and would like to cite it, you can use any of the following citations:
- "Acoustic beamforming for speaker diarization of meetings", Xavier Anguera,
Chuck Wooters and Javier Hernando, IEEE Transactions on Audio, Speech and Language Processing, September 2007, volume 15, number 7, pp. 2011-2023.
- "Robust Speaker Diarization for Meetings", Xavier Anguera, PhD Thesis, UPC Barcelona, 2006.
Latest news: I have created a SourceForge project for BeamformIt where everyone working on its development
can join in the effort. I will discontinue releases through this website once the SourceForge project is up and running.
- Change history
  - Version 1.0: Initial public version; includes some documentation, "clean" code, and RT06s conf. room example scripts.
  - Version 1.1:
    - Source code now has its own directory, making the structure a bit more organized.
    - Some warnings and obsolete functions were eliminated from the source code.
    - Documentation can now be compiled directly from the Makefile.
    - Other bits and pieces.
  - Version 2.0: Some fundamental improvements were made over version 1.1.
    - Fixed some potentially dangerous memory leaks.
    - Added a fourth library requirement, libsamplerate, which is used to automatically convert all input files with a different sample rate to 16 kHz.
    - Made the code independent of the input data format. When the input is in a different format than expected, it is converted automatically.
    - All temporary wave files are in WAV format, floating point, little-endian.
    - Internal rework of all variables to handle acoustic data in floating point precision, therefore accepting a broader range of input data.
    - Output signal weighting_factor and output normalization changed to accommodate the floating point data.
    - Changed several libraries to link dynamically when possible.
    - Input parameter functions made robust to errors in the parameters passed in. In the previous version, the system crashed without an error message if a parameter was misspelled.
    - Added running and configuration scripts for RT06s lecture room data, and changed the names of the conf. room data to accommodate them.
    - The variable min_xcorr used some hard-coded values specific to each meeting. These have been eliminated.
    - Fixed a bug when extracting multiple channels into independent files. This function has also been sped up by using buffers.
    - Fixed a bug that caused the signal to ground forever when the input signal was 0 for one analysis window.
    - Implemented different levels of printout information to accommodate different uses.
  - Version 3.0 (development version):
    - Another major rework was made, converting all the code to a more C++-like style. There is no longer a huge functions.cpp file.
    - Added a .vcproj file to open BeamformIt as a project in Visual Studio.
  - Version 3.1:
    - Implemented the variable reference channel selection algorithm, which allows the system to switch the reference channel to the one with the highest quality according to xcorr metrics. You can enable it with do_compute_reference=2.
    - Some parameter tuning was performed using signal quality tools, and some default parameters were changed. The cfg-files now contain two versions for the example meeting sets, with the old and the new parameters.
  - Version 3.3:
    - Fixed some major bugs that prevented running do_compute_reference=2 with acoustic optimization.
(See the documentation inside the project for later versions)
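The version 3.1 entry above mentions picking the reference channel according to xcorr metrics. The sketch below shows one way such a selection could work, assuming a channel's quality score is its average zero-lag normalized cross-correlation with every other channel over one analysis window. The score definition and the names `normalized_xcorr` and `pick_reference` are assumptions made for illustration; this page does not describe the algorithm BeamformIt actually uses.

```python
import math


def normalized_xcorr(a, b):
    """Zero-lag normalized cross-correlation of two equal-length signals."""
    energy = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if energy == 0.0:  # an all-zero window carries no information
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / energy


def pick_reference(channels):
    """Return the index of the channel most correlated with the others.

    Expects at least two channels of equal length.
    """
    scores = []
    for i, chan in enumerate(channels):
        others = [normalized_xcorr(chan, other)
                  for j, other in enumerate(channels) if j != i]
        scores.append(sum(others) / len(others))
    return max(range(len(channels)), key=scores.__getitem__)


# Hypothetical analysis window: channel 2 is mostly unrelated to the others,
# while channel 1 shares content with both channel 0 and channel 2.
window = [
    [1.0, 0.0, -1.0, 0.0],
    [1.0, 0.2, -1.0, -0.2],
    [0.1, 1.0, 0.1, -1.0],
]
print(pick_reference(window))  # 1
```

Re-running such a selection on each analysis window would let the reference change over time. The explicit zero-energy check keeps an all-silent window from producing a division by zero.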
- News
  - 27/07/2006: A Google group has been created for correspondence relating to BeamformIt. Check under the "Google groups" item to browse messages and to sign up. If you are interested in the software, please sign up so that all correspondence of public interest can get archived.
  - 29/10/2006: Finally done with writing my thesis! Now I can actually work a bit more on the BeamformIt code.
  - 20/11/2006: The first public version (1.0) is out.
  - 27/11/2006: After comments from "inside" people, I have made some changes to the code and here comes a new version, 1.1.
  - 04/12/2006: Lots of work has been done on usability, to run the system without pre-processing on all sorts of data. Some bugs were fixed and many improvements were put in place, leading to version 2.0.
  - 08/02/2009: Version 3.3 is public, which includes a complete rework of the code.
  - 03/02/2010: Versions 3.4 and 2.1 are public. They are basically bug fixes of the code. In the case of 3.4, these fixes make the code rework done in version 3 usable. Unless major bugs are found in 3.4 (which I doubt now), this is going to become the only version I will keep improving.
A Google Groups mailing list has been set up for any communication
regarding this software. Please sign up or browse the archives through
the links below.
- Future work
  - Add support for noise filtering techniques (like Wiener) from within the beamforming program (useful when multichannel files with non-standard file formats are to be processed).
  - I'm open to your suggestions.
Xavier Anguera (xanguera __--at--__
gmail.com), 2009