Forbidden hobby projects
Music matching and iPhone location tracking
The magic of programming
●
Quick question to start out with:
●
Why did you start programming...?
The magic of programming
●
My motivations:
●
To play games (couldn't buy/download games in my youth)
●
Boredom
●
Most of all, to see “how it works”, uncovering the “magic”
The magic of programming
●
Everybody has moments in life when you think: How the hell did it do that!?
My (computer program) examples:
●
Going from ASCII/text to pixels, graphics
●
First time I heard a computer 'talk'
●
Digitally enhancing a photo
●
Real '3D' games, Unreal
●
First time I saw OCR
●
Worked with voice recognition
●
Finding your house on Google Earth, and again with Street View
●
And, the last time I had the feeling was when using Shazam / SoundHound
Music matching
●
Shazam / SoundHound?
1. Hold the application (in this case on iPhone) next to the speaker
2. Let it listen for a while
3. It tells you which song it is...
Music matching
●
So, again, how does it work!?
●
Once you know, it seems pretty straightforward... let's take a look:
The microphone
●
Implementing it ourselves in Java...!
●
First, Java and the microphone:
final AudioFormat format = getFormat();
//Fill AudioFormat with the wanted settings
DataLine.Info info =
new DataLine.Info(TargetDataLine.class, format);
final TargetDataLine line =
(TargetDataLine) AudioSystem.getLine(info);
line.open(format);
line.start();
Audio Format
●
The AudioFormat used is:
private AudioFormat getFormat() {
float sampleRate = 44100;
int sampleSizeInBits = 8;
int channels = 1;
boolean signed = true;
boolean bigEndian = true;
return new AudioFormat(
sampleRate, sampleSizeInBits, channels, signed, bigEndian);
}
Reading the sound data
●
So, now we have a TargetDataLine
●
This can be read just like an InputStream:
out = new ByteArrayOutputStream();
running = true;
try {
while (running) {
int count = line.read(buffer, 0, buffer.length);
if (count > 0) {
out.write(buffer, 0, count);
}
}
out.close();
} catch (IOException e) {
e.printStackTrace();
System.exit(-1);
}
Sound, what does it look like?
●
… ok, so what have we actually read from the InputStream?
●
Well, this:
0 0 0 1 2 5 0 -1 -3 -4 -5 -2 0 1 2 0 2 (etc)
Wait, what!?
●
Does anybody know what this data represents?
●
Well, I didn't...
●
So I decided to plot it on a graph:
Vibrating membrane..?
●
The data is in “time domain”
●
Same as human ear:
●
This is useless to us...
One frequency
●
Blue: Time domain
●
Pink: Frequency domain
●
One single frequency will look/sound like one tone
Another frequency
●
Blue: Time domain
●
Pink: Frequency domain
●
A higher frequency will result in a higher tone
Frequencies combined
(figure: one frequency + another frequency = the combined signal)
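A minimal sketch of what these plots show, assuming plain sine tones: each tone is a list of samples over time, and combining two tones is just adding them sample by sample. The class and method names (ToneMixer, tone, mix) are made up for this illustration and are not part of the original program.

```java
final class ToneMixer {
    /** Returns `length` samples of a sine wave at `frequencyHz`, sampled at `sampleRate` Hz. */
    static double[] tone(double frequencyHz, double sampleRate, int length) {
        double[] samples = new double[length];
        for (int n = 0; n < length; n++) {
            samples[n] = Math.sin(2 * Math.PI * frequencyHz * n / sampleRate);
        }
        return samples;
    }

    /** Adds two equally long signals sample by sample (time domain). */
    static double[] mix(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("signals must have the same length");
        }
        double[] sum = new double[a.length];
        for (int n = 0; n < a.length; n++) {
            sum[n] = a[n] + b[n];
        }
        return sum;
    }
}
```

For example, ToneMixer.mix(ToneMixer.tone(440, 44100, 1024), ToneMixer.tone(880, 44100, 1024)) gives the kind of "messy" time-domain signal that still contains only two frequencies.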
Fourier Transformation
●
To match music we need frequencies
●
So we need to convert from time to frequency domain
●
This can be done with a Fourier Transformation
●
Implementation: Fast Fourier Transformation
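The FFT class used in these slides is not shown, so here is a minimal sketch of a standard iterative radix-2 FFT as a stand-in. It works on separate real/imaginary arrays instead of a Complex class, and the names SimpleFft, fft and magnitudes are assumptions for this example, not the original code.

```java
final class SimpleFft {
    /**
     * In-place iterative radix-2 FFT. `re` and `im` hold the real and
     * imaginary parts; their (equal) length must be a power of two.
     */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (n != im.length || Integer.bitCount(n) != 1) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Reorder the input into bit-reversed index order.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) {
                j ^= bit;
            }
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Combine results of size len/2 into results of size len.
        for (int len = 2; len <= n; len <<= 1) {
            double angle = -2 * Math.PI / len;
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(angle * k);
                    double wi = Math.sin(angle * k);
                    int a = start + k;
                    int b = a + len / 2;
                    double xr = re[b] * wr - im[b] * wi;
                    double xi = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - xr;
                    im[b] = im[a] - xi;
                    re[a] += xr;
                    im[a] += xi;
                }
            }
        }
    }

    /** Magnitude per frequency bin (first half of the spectrum) for real-valued samples. */
    static double[] magnitudes(double[] samples) {
        double[] re = samples.clone();
        double[] im = new double[samples.length];
        fft(re, im);
        double[] mag = new double[samples.length / 2];
        for (int k = 0; k < mag.length; k++) {
            mag[k] = Math.hypot(re[k], im[k]);
        }
        return mag;
    }
}
```

Feeding it a pure sine that fits exactly 5 periods into 64 samples gives one peak, at bin 5: the "pink" picture from the earlier slides.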
Timing removed...
●
But...
●
There is a problem, the transformation will remove all time knowledge!
●
We'll know which frequencies are used
●
But not when they are used...!
Windowing, creating slices
●
To solve this, we need to “window” the transformation
●
We take slices of time-domain data, and transform them:
byte audio[] = out.toByteArray();
final int amountSlices = audio.length / Harvester.SLICE_SIZE;
Complex[][] results = new Complex[amountSlices][];
//For every slice: copy the samples into complex numbers and transform them
for (int slice = 0; slice < amountSlices; slice++) {
    Complex[] complex = new Complex[Harvester.SLICE_SIZE];
    for (int i = 0; i < Harvester.SLICE_SIZE; i++) {
        complex[i] = new Complex(audio[(slice * Harvester.SLICE_SIZE) + i], 0);
    }
    results[slice] = FFT.fft(complex);
}
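A small sketch of the slicing step on its own, plus the arithmetic behind it. The value of Harvester.SLICE_SIZE is not given on the slides, so the slice size is a parameter here; the names Slicer, slices, sliceSeconds and binWidthHz are made up for this example.

```java
final class Slicer {
    /**
     * Splits raw 8-bit samples into full slices of `sliceSize` samples.
     * A trailing partial slice is dropped, as in the integer division above.
     */
    static double[][] slices(byte[] audio, int sliceSize) {
        int amountSlices = audio.length / sliceSize;
        double[][] result = new double[amountSlices][sliceSize];
        for (int slice = 0; slice < amountSlices; slice++) {
            for (int i = 0; i < sliceSize; i++) {
                result[slice][i] = audio[slice * sliceSize + i];
            }
        }
        return result;
    }

    /** Duration of one slice in seconds: the time resolution of the analysis. */
    static double sliceSeconds(int sliceSize, double sampleRate) {
        return sliceSize / sampleRate;
    }

    /** Width of one FFT bin in Hz: the frequency resolution of the analysis. */
    static double binWidthHz(int sliceSize, double sampleRate) {
        return sampleRate / sliceSize;
    }
}
```

With a hypothetical slice size of 4096 at the 44100 Hz sample rate used earlier, one slice covers about 93 ms and one bin is about 10.8 Hz wide. Bigger slices give finer frequencies but coarser timing, and the other way around.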
Spectrum Analyzer
●
Now we have slices of frequency data
●
Where have I heard this before?
from wikipedia.org:
Spectrum Analyzer
A spectrum analyzer or spectral analyzer is a device used to examine the
spectral composition of some electrical, acoustic, or optical waveform.
●
So with the current data, we have created a digital spectrum analyzer!!
Aphex Twin
●
Demo of the recording program as spectrum analyzer
●
Introducing Aphex Twin
Key moments
●
How are we going to match this data?
●
First we need to determine 'key moments'
●
Key moment:
●
The loudest frequencies for a given moment in time
Ranges
●
First we divide the frequencies into 5 ranges:
public static final int[] RANGE = new int[] {40, 80, 120, 180, UPPER_LIMIT + 1};
//Find out in which range
public static int getIndex(int freq) {
    int i = 0;
    while (RANGE[i] < freq) i++;
    return i;
}
//0:   0 - 40
//1:  40 - 80
//2:  80 - 120
//3: 120 - 180
//4: 180 - ~~~
Saving the points
●
Now we get the points and save them:
//For every slice of data:
for (int freq = LOWER_LIMIT; freq < UPPER_LIMIT - 1; freq++) {
    //Get the log(magnitude):
    double mag = Math.log(results[freq].abs() + 1);
    //Use the getIndex method to determine our range:
    int index = getIndex(freq);
    //Save the highest magnitude and corresponding frequency:
    if (mag > topMagnitude[index]) {
        highestFrequency[index] = freq;
        topMagnitude[index] = mag;
    }
}
//Write the points to a file:
for (int i = 0; i < AMOUNT_OF_POINTS; i++) {
    fw.append(highestFrequency[i] + "\t");
}
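To make the snippet above concrete, here is a self-contained sketch of the same peak picking for one slice. It takes plain magnitudes instead of Complex values, and the LOWER_LIMIT and UPPER_LIMIT values are assumptions, since the slides do not state them. The class name KeyPoints and the method strongestBins are also made up for this example.

```java
final class KeyPoints {
    static final int LOWER_LIMIT = 30;   // assumed value, not given on the slides
    static final int UPPER_LIMIT = 300;  // assumed value, not given on the slides
    static final int[] RANGE = {40, 80, 120, 180, UPPER_LIMIT + 1};

    /** Finds out in which range a frequency bin falls. */
    static int getIndex(int freq) {
        int i = 0;
        while (RANGE[i] < freq) i++;
        return i;
    }

    /**
     * For one slice, returns the bin with the highest log-magnitude in each range.
     * `magnitudes[bin]` must have at least UPPER_LIMIT entries.
     * A range without any energy keeps the value 0.
     */
    static int[] strongestBins(double[] magnitudes) {
        int[] highestFrequency = new int[RANGE.length];
        double[] topMagnitude = new double[RANGE.length];
        for (int freq = LOWER_LIMIT; freq < UPPER_LIMIT - 1; freq++) {
            double mag = Math.log(magnitudes[freq] + 1);
            int index = getIndex(freq);
            if (mag > topMagnitude[index]) {
                highestFrequency[index] = freq;
                topMagnitude[index] = mag;
            }
        }
        return highestFrequency;
    }
}
```

A slice with peaks at bins 37, 41, 92, 140 and 187 produces exactly those five numbers, one per range. These "frequencies" are array indices (FFT bins), not Hz.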
Result!
●
This results in a file which looks like this:
37  41  92   140  187
34  41  92   129  186
39  41  117  130  218
40  42  106  129  191
40  47  117  121  217
40  53  81   129  208
40  48  109  132  260
39  45  89   135  247
40  42  84   125  251
40  41  81   121  232
38  42  113  131  245
37  42  92   129  239
37  41  81   154  240
38  41  81   180  181
39  41  86   121  208
37  41  92   123  194
36  42  106  140  196
(etc...)
Demo, recording the points
●
Demo of 'key moments' with the recording-program
Matching, against...?
●
Now we have the data that, in theory, can be used to match
●
What are we going to match against?
●
Time to index my music collection
Harvesting music
public void harvest(File mp3Directory) {
    String[] itemsInDirectory = mp3Directory.list();
    for (String itemInDirectory : itemsInDirectory) {
        if (itemInDirectory.endsWith(".mp3")) {
            //Assume mp3 file
            File mp3File = new File(mp3Directory, itemInDirectory);
            captureAudio(mp3File);
        } else if (new File(mp3Directory, itemInDirectory).isDirectory()) {
            //Directory? Recurse!
            harvest(new File(mp3Directory, itemInDirectory));
        }
    }
}
The magic of programming
●
Now we have:
●
Set of reference data (3000+ songs)
●
Method of capturing live data with microphone
●
Time to match against the samples
Hashing
●
Create a single hash-number per slice
private static final int FUZ_FACTOR = 2;
private long hash(String line) {
    String[] p = line.split("\t");
    long p1 = Long.parseLong(p[0]);
    long p2 = Long.parseLong(p[1]);
    long p3 = Long.parseLong(p[2]);
    long p4 = Long.parseLong(p[3]);
    //long p5 = Long.parseLong(p[4]);
    //Exact version (no fuzz):
    //return p4 * 100000000 + p3 * 100000 + p2 * 100 + p1;
    //Fuzzy version: round every point down to a multiple of FUZ_FACTOR
    return (p4 - (p4 % FUZ_FACTOR)) * 100000000 +
           (p3 - (p3 % FUZ_FACTOR)) * 100000 +
           (p2 - (p2 % FUZ_FACTOR)) * 100 +
           (p1 - (p1 % FUZ_FACTOR));
}
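A worked example of what the fuzz factor buys us, as a sketch: the same hash function wrapped in a standalone class (the name FuzzyHash is made up for this example). Two slices whose points differ by one bin, for instance because of microphone noise, still end up with the same hash.

```java
final class FuzzyHash {
    private static final int FUZ_FACTOR = 2;

    /** Hashes the first four tab-separated points of a line, rounding each down to a multiple of FUZ_FACTOR. */
    static long hash(String line) {
        String[] p = line.split("\t");
        long p1 = Long.parseLong(p[0]);
        long p2 = Long.parseLong(p[1]);
        long p3 = Long.parseLong(p[2]);
        long p4 = Long.parseLong(p[3]);
        return (p4 - (p4 % FUZ_FACTOR)) * 100000000
             + (p3 - (p3 % FUZ_FACTOR)) * 100000
             + (p2 - (p2 % FUZ_FACTOR)) * 100
             + (p1 - (p1 % FUZ_FACTOR));
    }

    public static void main(String[] args) {
        // 37 -> 36, 41 -> 40, 92 -> 92, 140 -> 140
        System.out.println(hash("37\t41\t92\t140\t187")); // prints 14009204036
        // 36 -> 36, 40 -> 40, 92 -> 92, 141 -> 140: same hash
        System.out.println(hash("36\t40\t92\t141\t187")); // prints 14009204036
    }
}
```

The arithmetic: 140 * 100000000 + 92 * 100000 + 40 * 100 + 36 = 14009204036. A bigger fuzz factor tolerates more noise but also produces more false matches.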
Matching the hashes
●
First method of matching:
1. Load all the reference hashes
2. Listen to the microphone and generate hashes
3. Find all matching hashes
4. Return the reference-song with most hits
●
This worked a little, but produced a lot of mis-hits
Second try...
●
Second generation, take advantage of timing:
1. Load all the reference hashes and their relative time in the song
2. Listen to the microphone and generate hashes
3. Align the hash hits in time
4. Group the hits by song & time
5. Return the song which has most hits in time
Explaining the matching
●
Example matches:
Microphone, slice nr 1, matches with:
●
Song 4, slice nr 4
●
Song 6, slice nr 9
Microphone, slice nr 2, matches with:
●
Song 4, slice nr 6
●
Song 6, slice nr 10
●
Song 8, slice nr 4
Microphone, slice nr 3, matches with:
●
Song 4, slice nr 5
●
Song 5, slice nr 3
Explaining the matching
●
Subtract moments in time:
Microphone, slice nr 1, matches with:
●
Song 4, slice nr 4 – 1 = 3
●
Song 6, slice nr 9 – 1 = 8
Microphone, slice nr 2, matches with:
●
Song 4, slice nr 6 – 2 = 4
●
Song 6, slice nr 10 – 2 = 8
●
Song 8, slice nr 4 – 2 = 2
Microphone, slice nr 3, matches with:
●
Song 4, slice nr 5 – 3 = 2
●
Song 5, slice nr 3 – 3 = 0
Explaining the matching
●
Group the unique song & subtracted timing:
2x: Song 6 & time 8
1x: Song 4 & time 3
1x: Song 4 & time 4
1x: Song 8 & time 2
1x: Song 4 & time 2
1x: Song 5 & time 0
●
Match is probably song 6, not song 4...!
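The subtract-and-group steps above can be sketched as follows. This is an illustration, not the original program: every hit is a triple (microphone slice, song, song slice), and the names OffsetVoting and bestMatch are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class OffsetVoting {
    /**
     * `hits` holds rows of {microphoneSlice, songId, songSlice}.
     * Returns {songId, offset, count} for the (song, offset) pair with the
     * most hits, or null when there are no hits. Ties keep the first pair found.
     */
    static int[] bestMatch(int[][] hits) {
        Map<Long, Integer> votes = new HashMap<>();
        int[] best = null;
        for (int[] hit : hits) {
            int songId = hit[1];
            // Subtract moments in time: song slice minus microphone slice.
            int offset = hit[2] - hit[0];
            // Pack song and offset into one map key.
            long key = ((long) songId << 32) | (offset & 0xffffffffL);
            int count = votes.merge(key, 1, Integer::sum);
            if (best == null || count > best[2]) {
                best = new int[] {songId, offset, count};
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] hits = {
            {1, 4, 4}, {1, 6, 9},
            {2, 4, 6}, {2, 6, 10}, {2, 8, 4},
            {3, 4, 5}, {3, 5, 3},
        };
        int[] best = bestMatch(hits);
        // prints: song 6, offset 8, 2 hits
        System.out.println("song " + best[0] + ", offset " + best[1] + ", " + best[2] + " hits");
    }
}
```

Song 4 has three raw hits, but at three different offsets (3, 4 and 2), so no single alignment explains them. Song 6 has two hits at the same offset, which is what a real match looks like.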
Explaining the matching
●
Power of this algorithm (in theory):
●
Searching for corresponding hashes: O(1)
●
Then, for the limited set of matches, align timing
●
Produces very good results, very fast!
●
Note: My program reloads the hashtable each try, not efficient!
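A minimal sketch of the lookup table behind the O(1) claim, assuming a plain in-memory HashMap from hash to all (song, slice) positions where that hash occurs. The class HashIndex and its methods are made up for this example; building it once and keeping it in memory avoids the reload on every try.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HashIndex {
    // hash -> list of {songId, slice} positions where that hash was seen
    private final Map<Long, List<int[]>> index = new HashMap<>();

    /** Registers one reference hash; called once per slice while harvesting. */
    void add(long hash, int songId, int slice) {
        index.computeIfAbsent(hash, h -> new ArrayList<>()).add(new int[] {songId, slice});
    }

    /** Returns all {songId, slice} positions for a hash: an average O(1) map lookup. */
    List<int[]> lookup(long hash) {
        return index.getOrDefault(hash, Collections.emptyList());
    }
}
```

Every microphone hash costs one lookup, and only the positions that come back need to be aligned in time.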
Demo time!
●
Time for another demo!
●
Want to try it out?
●
Bring your phone/mp3 player here!
●
Note: For it to work you'll need to have a similar taste in music to mine...
Other things you can do with this technology
●
The algorithm could also be used to:
●
Detect duplicate songs in your music collection
●
Synchronise subtitles in movies (?!)
Forbidden..!
●
Final warning:
I am Darren Briggs, the Chief Technical Officer of Landmark Digital
Services, LLC. Landmark Digital Services owns the patents that cover the
algorithm used as the basis for your recently posted “Creating Shazam In
Java”. While it is not Landmark’s intention to alienate those in the Open
Source and Music Information Retrieval community, Landmark must request
that you do not ship, deploy or post the code presented in your post.
Landmark also requests that in the future you do not ship, deploy or post
any portions or versions of this code in its current state or in any
modified state.
We hope you understand our position and that we would be legally remiss not
to make this request. We appreciate your immediate attention and response.
Best regards,
Darren P. Briggs
Vice President &
Chief Technical Officer
Landmark Digital Services, LLC
iPhone location data
●
Suddenly in the news:
●
The iPhone stores location/GPS data, and stores this on your PC
●
Proof of concept released, but...
●
Only works on the Mac
●
I want to see this too!!
iPhone location data
●
What is inside an iPhone backup?
iPhone location data
●
What we need, the file containing location data:
●
“Library/Caches/locationd/consolidated.db”
●
(in my case: 4096c9ec676f2847dc283405900e284a7c815836)
●
To find that file:
●
Manifest.mbdx
●
Manifest.mbdb
Manifests
●
Manifest.mbdx
●
Contains a list of hex-filenames and pointers into the MBDB file
●
Like this:
–
4096c9ec676f2847dc283405900e284a7c815836 : 23
–
84c815832834059007dc76f2e284a740696c9ec6 : 57
–
etc...
●
Manifest.mbdb
●
Contains information about the files, without mentioning their hex-filename
●
Like this:
–
23 : real filename, domain, filemode, last modified, etc
–
57 : real filename, domain, filemode, last modified, etc
–
etc...
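A side note as a sketch, and an assumption rather than something these slides state: as far as I know, the hex-filename in backups of this era is the SHA-1 of the domain, a dash, and the real filename, so it can also be computed directly. The domain name "RootDomain" for this file is an assumption too, and the class BackupNames and method hexName are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class BackupNames {
    /**
     * Returns the lowercase hex SHA-1 of "domain-path".
     * Assumption: this is how the backup derives its hex-filenames.
     */
    static String hexName(String domain, String path) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest((domain + "-" + path).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        // "RootDomain" is an assumed domain name for this file.
        System.out.println(hexName("RootDomain", "Library/Caches/locationd/consolidated.db"));
    }
}
```

If the assumption holds, the output should equal the hex-filename found through the manifests. Parsing the two manifest files remains the safe route.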
consolidated.db
●
Parse both files to find the hex-filename, and the real filename.
●
Then we have: “Library/Caches/locationd/consolidated.db”
●
This is a SQLite file.
“SQLite is a software library that implements a self-contained, serverless,
zero-configuration, transactional SQL database engine.”
consolidated.db
●
This file is easily parsed with SQLJet (http://sqljet.com/)
SqlJetDb db = SqlJetDb.open(dbFile, false);
db.beginTransaction(SqlJetTransactionMode.READ_ONLY);
try {
    //TODO: Extend to use: db.getTable("WifiLocation");
    ISqlJetTable table = db.getTable("CellLocation");
    extractGeoDataFromTable(geoData, table);
} finally {
    //Always end the read transaction and release the file
    db.commit();
    db.close();
}
Getting lat/long positions
●
Now we can read the lat/long positions:
ISqlJetCursor cursor = table.open();
try {
    if (!cursor.eof()) {
        do {
            double latitude = cursor.getFloat("Latitude");
            double longitude = cursor.getFloat("Longitude");
            double timestamp = cursor.getFloat("Timestamp");
            int confidence = (int) cursor.getInteger("Confidence");
            //... (the slide stops here; use the values of this row)
        } while (cursor.next());
    }
} finally {
    cursor.close();
}
Displaying the geo locations
●
For display I chose Google Earth, using the KML file format.
●
Simple XML:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document>
<Placemark>
<name>25-jun-2010 1:20:10</name>
<Point><coordinates>4.28304672,51.95279973,0</coordinates></Point>
</Placemark>
<Placemark>
<name>25-jun-2010 1:25:39</name>
<Point><coordinates>4.32921826,52.00744783,0</coordinates></Point>
</Placemark>
...
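A minimal sketch of turning one database row into such a Placemark. Two things to note: KML wants longitude first, then latitude, and the timestamp conversion assumes the Timestamp column counts seconds since 1 January 2001 (UTC), which the slides do not state. The class KmlWriter and its methods are made up for this example, and the name is not XML-escaped here.

```java
import java.time.Instant;
import java.util.Locale;

final class KmlWriter {
    // Seconds between 1970-01-01 and 2001-01-01 (UTC).
    static final long SECONDS_1970_TO_2001 = 978_307_200L;

    /** Converts a Timestamp value, assumed to be seconds since 2001-01-01 UTC, to an Instant. */
    static Instant toInstant(double timestamp) {
        return Instant.ofEpochSecond(SECONDS_1970_TO_2001 + (long) timestamp);
    }

    /** Builds one Placemark; note the KML order: longitude, latitude, altitude. */
    static String placemark(String name, double latitude, double longitude) {
        return String.format(Locale.ROOT,
                "<Placemark><name>%s</name>"
                + "<Point><coordinates>%.8f,%.8f,0</coordinates></Point></Placemark>",
                name, longitude, latitude);
    }
}
```

For example, KmlWriter.placemark(KmlWriter.toInstant(timestamp).toString(), latitude, longitude) gives one entry for the Document above.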
iPhone, final notes
●
JavaScript version:
●
A friend translated the code, and wrote an SQLite library, in JavaScript
●
Can be found here: http://markolson.github.com/js-sqlite-map-thing/
●
Problems:
●
Apple's latest firmware deletes all data, and it is no longer backed up.
Demo!
Questions?
●
Any questions?