
The Ghost in the MP3



Paul P

Max Output Level: -48.5 dBFS

Total Posts : 2685
Joined: 2012/12/08 17:15:47
Location: Montreal
Status: offline

2015/05/02 22:52:36 (permalink)


Heard an interesting interview on the radio about an academic computer-music composer who works with what is lost when a wav is converted to an mp3. The ghost (what is 'removed' from the original) is then used as raw material for composition and effects. Sounds pretty cool. Web page with lots of info and the paper that describes the project in more detail. Here's the interview (7min) but the composer doesn't say how he does what he does.

I know nothing of how an mp3 is produced, but it looks like it's a lot more complicated than I'd imagined, and so is the means of bringing the ghost back to life, though I haven't read the documentation in detail. I would have thought you could just invert the mp3 and sum it with the original. I'm going to have to try that out.
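For anyone who wants to try the invert-and-sum idea outside a DAW, here is a minimal Python sketch. It assumes NumPy is available and uses two synthetic tones as a stand-in for a real wav and its decoded mp3; the function name `invert_and_sum` and the test tones are made up for illustration and are not taken from the paper. With real files the decoded mp3 would most likely need to be time-aligned with the original first, since mp3 encoding typically adds a short delay and some padding.

```python
import numpy as np

def invert_and_sum(original, decoded):
    """Flip the polarity of `decoded` and sum it with `original`.

    What remains is whatever the decoded copy is missing (the "ghost").
    Both inputs are assumed to be sample-aligned mono float arrays.
    """
    n = min(len(original), len(decoded))  # decoded files are often a bit longer
    return original[:n] + (-decoded[:n])

# Synthetic stand-ins (assumption): a 440 Hz tone plus a quiet 17 kHz tone,
# and a "decoded" copy in which the 17 kHz tone has been thrown away.
sr = 44100
t = np.arange(sr) / sr
low = 0.5 * np.sin(2 * np.pi * 440 * t)
high = 0.1 * np.sin(2 * np.pi * 17000 * t)
original = low + high
decoded = low.copy()

ghost = invert_and_sum(original, decoded)
print("peak level of the ghost:", np.max(np.abs(ghost)))
```

In this toy case the ghost is just the discarded 17 kHz tone. A real encoder changes far more than one component, so the real residual would be much messier.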

post edited by Paul P - 2015/05/02 23:03:12

Sonar Platinum [2017.10], Win7U x64 sp1, Xeon E5-1620 3.6 GHz, Asus P9X79WS, 16 GB ECC, 128gb SSD, HD7950, Mackie Blackjack


bitflipper

01100010 01101001 01110100 01100110 01101100 01101

Total Posts : 26036
Joined: 2006/09/17 11:23:23
Location: Everett, WA USA
Status: offline

Re: The Ghost in the MP3 2015/05/03 10:09:54 (permalink)

Yes, that's how I'd go about it: invert and sum. The author does indeed start there, but then applies a precise analysis of the spectral differences by programmatically comparing each FFT bin individually. It would not, however, be necessary to go to such lengths if you just wanted to do some experiments for yourself.
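A rough sketch of what a bin-by-bin comparison could look like, for the curious. This is not the author's actual code: it assumes NumPy, looks at a single Hann-windowed frame only, and the function name `bin_loss_db`, the frame size and the synthetic test tones are all my own choices. It reports, for each FFT bin, how many dB the decoded copy has lost relative to the original.

```python
import numpy as np

def bin_loss_db(original, decoded, n_fft=2048):
    """Per-bin magnitude loss in dB for the first n_fft samples.

    Positive values mean the bin is weaker in `decoded` than in `original`.
    """
    window = np.hanning(n_fft)
    mag_orig = np.abs(np.fft.rfft(original[:n_fft] * window))
    mag_dec = np.abs(np.fft.rfft(decoded[:n_fft] * window))
    eps = 1e-12  # avoids division by zero and log of zero
    return 20 * np.log10((mag_orig + eps) / (mag_dec + eps))

# Synthetic stand-ins (assumption): the "decoded" copy has lost a 17 kHz tone.
sr = 44100
t = np.arange(sr) / sr
original = 0.5 * np.sin(2 * np.pi * 440 * t) + 0.1 * np.sin(2 * np.pi * 17000 * t)
decoded = 0.5 * np.sin(2 * np.pi * 440 * t)

loss = bin_loss_db(original, decoded)
freqs = np.fft.rfftfreq(2048, d=1 / sr)  # centre frequency of each bin in Hz
for f_target in (440, 17000):
    k = int(np.argmin(np.abs(freqs - f_target)))
    print(f"{freqs[k]:8.1f} Hz: {loss[k]:6.1f} dB lost")
```

For real material you would step this over successive frames, and the two files would have to be sample-aligned first, otherwise the comparison mostly measures the misalignment.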

What would surprise me is if that "ghost" information turned out to be musically significant. I wouldn't expect it to be. But it might be a fun experiment to try.

People tend to freak out when they hear what's been excised by MP3 encoding. Lots of high-frequency information and transients are given up. Your first reaction is "that stuff doesn't sound insignificant!". But in theory, it's all information that your ears would have naturally filtered out anyway, so nothing of importance is lost.

The problem with theory versus practice is that between what's clearly audible and what's clearly inaudible lies a large grey area where an element may or may not be audible depending on many factors.

Pick up a book on perceptual encoding and look at the standard graph given for the temporal-masking "shadow". That graph was arrived at by averaging the results of subjective listening tests by many people. Chances are, it doesn't match your own shadow exactly, just as you probably don't match the average height, weight, eye color, tolerance to cold weather or hot peppers.

The encoder has to assume that your perception is close enough to the average. It also has to ignore some variables in the name of efficiency. It cannot know with certainty that an element is definitely audible or inaudible to any specific listener.

Anyhow, that's all academic because for me, 192 kb/s or higher sounds OK, and - for me - 320 kb/s is indistinguishable from the original wave file. The author actually had to jump through some hoops to get any usable information from his "ghost" data.

All else is in doubt, so this is the truth I cling to.
