Topic Title: Big Analogue Data - why bigger isn't always better
Topic Summary: JPEG is not a good example
Created On: 18 October 2013 03:21 PM
Status: Read Only
Related E&T article: Big Analogue Data - why bigger isn't always better
 18 October 2013 03:21 PM


Posts: 1
Joined: 12 January 2003

Whilst the article is very interesting, the example of JPEG compression == "without loss of information" is not a good one. It is well known that JPEG compression is "lossy". The term "perfectly good" depends on what you want to use the image of the flower for. It's fine for viewing with the eye, but what if you were interested in analysing the pattern of veins in the petals?
 19 October 2013 12:34 AM


Posts: 605
Joined: 17 September 2001

I don't see where it says that JPEG is "without loss of information". That said, the JPEG standard does have a lossless mode. However, it is little known and rarely implemented.

S P Barker BSc PhD MIET
 19 October 2013 11:45 AM


Posts: 433
Joined: 15 April 2013

I am intrigued.

How can you compress a string of data without losing information? It could be that the information which is lost or rejected is not relevant to the discussion, but it is still information that is not being transmitted?

Ken Green
 19 October 2013 03:12 PM


Posts: 48
Joined: 14 July 2003

Let's answer this by example.

Suppose for the purpose of illustration that our space of bytes to be compressed can only contain upper case letters, digits, and the symbols +-*/#

Further, suppose that the symbol # is VERY rarely seen.

Then the string

AAAAAAA#/+BBBBBCCCCCCCCCC-*557#EEEEEE46888

can be compressed as

#7A##/+#5B#10C-*557###6E46888

(just as an example)

by using the protocol that any # character is encoded by ##, any alphabetic character that occurs more than three times is encoded by inserting a single # character, a count and then the alphabetic character that is repeated, and digits are just left as is.

That string breaks down as follows:-

#7A = seven consecutive A characters
## = a single #
/+ = /+
#5B = five consecutive B characters
#10C = ten consecutive C characters
-*557 = -*557
## = a single # character
#6E = six consecutive E characters
46888 = 46888

This compresses a 42-character string down to 29 characters.

(Obviously real-world compression algorithms are a lot more sophisticated than this, and potentially operate on repeat units larger than a single character, but you get the idea.)
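
For anyone who wants to try the protocol described above, here is a minimal Python sketch of it. The function names rle_encode and rle_decode are my own, and the input string is the one obtained by decoding the breakdown above, so treat it as an illustration under those assumptions rather than a reference implementation. It relies on the rule that digits are never run-length encoded, which is what keeps the count after a # unambiguous.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Encode text with the '#'-escape run-length protocol (hypothetical helper)."""
    out = []
    for ch, run in groupby(text):
        n = len(list(run))
        if ch == "#":
            # Every literal '#' is doubled so it cannot be mistaken for a run marker.
            out.append("##" * n)
        elif ch.isalpha() and n > 3:
            # A run of more than three identical letters becomes '#', count, letter.
            out.append(f"#{n}{ch}")
        else:
            # Short runs, digits and other symbols are copied unchanged.
            out.append(ch * n)
    return "".join(out)


def rle_decode(packed: str) -> str:
    """Invert rle_encode, recovering the original text exactly."""
    out = []
    i = 0
    while i < len(packed):
        ch = packed[i]
        if ch != "#":
            out.append(ch)
            i += 1
        elif packed[i + 1] == "#":
            # '##' stands for one literal '#'.
            out.append("#")
            i += 2
        else:
            # '#', then a decimal count, then the repeated letter.
            j = i + 1
            while packed[j].isdigit():
                j += 1
            out.append(packed[j] * int(packed[i + 1:j]))
            i = j + 1
    return "".join(out)


original = "AAAAAAA#/+BBBBBCCCCCCCCCC-*557#EEEEEE46888"
packed = rle_encode(original)
print(packed, len(original), len(packed))
```

Decoding the packed string gives back the original character for character, which is the sense in which nothing is lost.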

 19 October 2013 03:13 PM


Posts: 521
Joined: 14 September 2010

It is quite simple to compress data without losing information. More complex non-lossy compressions are more difficult.

A simple way to compress picture data would be to store a binary bitmap. Where there is a sequence of repeating bits, you only have to store the first one and a series of bits to describe how many bits are to be repeated. This can result in quite drastic file reductions for simple images, but is not that impressive for complex images. Another algorithm could look for repeated patterns of binary bits, in say a 2x2 grid or 4x4 or more, and swap them for a smaller binary sequence.

You could send a text file by swapping out words that are commonly used for a short binary number; or you could not send any vowels and use a look-up table at the other end to put them back in.

So, yes, the information is not being transmitted as you say, but the compressed and decompressed data is exactly the same and as good as information that has not been compressed.
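
The word-swapping idea can be sketched in a few lines of Python. Everything named here (CODEBOOK, pack, unpack) is hypothetical, and the sketch assumes both ends already share the same table of common words. Small integers stand in for the "short binary number".

```python
# Hypothetical table of common words, agreed in advance by sender and receiver.
CODEBOOK = ["the", "of", "and", "to", "is"]
WORD_TO_CODE = {word: code for code, word in enumerate(CODEBOOK)}


def pack(text: str) -> list:
    """Replace each common word with its integer code; keep other words as-is."""
    return [WORD_TO_CODE.get(word, word) for word in text.split(" ")]


def unpack(tokens: list) -> str:
    """Look each integer code up in the shared table to rebuild the text."""
    return " ".join(CODEBOOK[t] if isinstance(t, int) else t for t in tokens)


message = "the cat and the dog"
tokens = pack(message)
print(tokens)
print(unpack(tokens) == message)
```

The saving comes entirely from the shared table: the receiver can only put the words back because it already holds the same list.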
 19 October 2013 04:29 PM


Posts: 433
Joined: 15 April 2013

ah, I see, said the blind man.

What I don't see is what is new in all this? For example: in the wild, simple survival depends on speed of response which, in its turn, depends on speed of sensors. I think most people look upon the eye as a detector of superior quality, whereas in fact it is a pretty lousy performer? For one thing, whatever its output may be, it is of little use in pre-empting an attack by a predator because it records information from 3-D onto a 2-D screen which is curved concave fashion in 3-D - the information which finally reaches the brain is computed in the light of experience gathered since leaving the womb. Nevertheless, it will detect a predator at a distance such that the object is extremely small.

The miracle is achieved by comparing a string of images and noting small changes from frame to frame - a technique which has been adopted by "clever" television engineers. As ever, nature got there long before the arrival of Homo sapiens, but then mother nature had the advantage of not having to rely on the idiosyncrasies of these clumsy digital computers.

Ken Green
 19 October 2013 04:54 PM


Posts: 521
Joined: 14 September 2010

That's a tangential response, Ken.

In fact, the brain uses very similar pattern recognition to the ones described. Patterns will be compared against existing "templates". The most well-known example of this is pareidolia, which causes people to be convinced they have seen angels/ghosts/aliens instead of merely rustling leaves - and so deep and hard-wired is the brain's response that there is often no convincing people otherwise - they really have seen something that is not there.

As per your example of the predator - it was always in our ancestors' best interests to assume the rustling in the leaves and the shadowy face in the bushes WAS a lion and run away, rather than to be rational. The risk was not worth it - on the odd occasions it was a lion, the rational hunter got eaten!

The brain will also only notice changes in a scene rather than analysing the entire scene all of the time - and is easily tricked. There are famous examples on the net of (fake) newsreaders changing their clothes mid-report without you noticing. The brain is not expecting a change in scene, so even when there is one, it goes completely unnoticed. This is a trick sleight-of-hand magicians use to their advantage.

So another way to compress data might be to only transmit what has changed in a scene, rather than the whole scene.
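
As a rough illustration of sending only what has changed, here is a small Python sketch. It assumes a frame is just a flat list of pixel values and that the receiver already holds the previous frame; frame_delta and apply_delta are made-up names, and real video codecs are far more elaborate than this.

```python
def frame_delta(prev: list, curr: list) -> list:
    """List (position, new_value) pairs for every pixel that differs from prev."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]


def apply_delta(prev: list, delta: list) -> list:
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(prev)
    for i, value in delta:
        frame[i] = value
    return frame


prev_frame = [0, 0, 5, 5, 0, 0]
curr_frame = [0, 0, 5, 7, 0, 0]
delta = frame_delta(prev_frame, curr_frame)
print(delta)
print(apply_delta(prev_frame, delta) == curr_frame)
```

When little moves between frames the list of changes is much shorter than the frame itself, yet the receiver still reconstructs the new frame exactly.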

And of course, we use compression all the time. Words are a form of compression. You can say the word "banana" and the hearer will have an impression of what you are talking about. You don't need to go into great detail about the nature of the object. "Banana" is enough. And then there is shorthand, and abbreviations. Why write "The Institution of Engineering and Technology" when you can write the IET? That is a decrease in information sent of 93%.

Mathematical expression is another form of compression. You could tediously write out the relationship between several parameters (which would soon get tiring), or simply write down an equation.

Edited: 19 October 2013 at 05:12 PM by Zuiko
 19 October 2013 07:28 PM


Posts: 433
Joined: 15 April 2013


Great to know that we are in such agreement.

 02 November 2013 04:50 PM


Posts: 409
Joined: 15 May 2002

Another way to achieve compression is to define the corners of polygons within which the pixels are all the same colour, which is very effective in photos with lots of sky. Lossy compression defines the corners of polygons within which the pixels are of very similar colours.
 13 November 2013 04:53 AM


Posts: 433
Joined: 15 April 2013


Now you seem to be indulging in the international sport of translating goalposts, although I must admit that you are pretty good with the art of camouflage.

Writing IET or BBC or EU does not involve loss of anything, because such abbreviations are possible only because they rely on universally agreed learning. If you wish to draw an analogy, I would suggest contractions such as won't or don't or can't. Certainly, there would be a loss of information should you write such things to pupils struggling to learn the language in general.

Ken Green
