Science Forums

Recommended Posts

Posted

Most of the scholarly sorts who frequent hypography know already that when one speaks of “1 MB” of storage, one doesn’t mean [math]10^6[/math] = 1,000,000 bytes, but [math]2^{20}[/math] = 1,048,576 bytes. Fewer, I suspect, are aware that, to address this common but technically imprecise misuse of SI prefixes (Kilo, Mega, Giga, etc.), in 1998 the IEC (with little argument the most important standards organization in the world) published in IEC 60027-2 a set of standard prefixes for powers of 2 – Kibi, Mebi, Gibi, etc. (For a description of the problem and solution straight from the IEC’s website, see IEC - IEC in action | SI zone > Prefixes for binary multiples.)

 

Despite how easy it is to speak, spell and abbreviate – just replace the last syllable of the nearest SI decimal prefix with “bee”, spelled “bi” (e.g., Mega becomes Mebi), and insert a lower case “i” after the one-letter SI abbreviation (e.g., MB becomes MiB) – this binary prefix scheme appears to me to be almost completely ignored, even among electrotechnical professionals. After a year or so of diligently saying kibi this and mebi that in every possible conversation (prompting, I suspect, many to whisper behind my back “that Craig seems a bright fellow, but what’s with that speech impediment whenever he talks about kilobytes and megabytes?”), and enduring my spell-checkers’ determined attempts to correct my use of “KiB”, “MiB”, “GiB”, etc. in documents, I surrendered to conformity and went back to using the usual pronunciations and abbreviations.
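
For the programmers here, a minimal sketch of the difference in practice (my own illustration in Python – the prefix tables and the format_bytes helper are mine, not any library’s API):

[code]
# Contrast the SI decimal prefixes with the IEC 60027-2 binary prefixes
# for the same byte count. Illustrative only.

SI_PREFIXES = [("kB", 10**3), ("MB", 10**6), ("GB", 10**9), ("TB", 10**12)]
IEC_PREFIXES = [("KiB", 2**10), ("MiB", 2**20), ("GiB", 2**30), ("TiB", 2**40)]

def format_bytes(n, prefixes):
    """Format a byte count using the largest prefix that does not exceed it."""
    label, factor = prefixes[0]
    for lbl, fac in prefixes:
        if n >= fac:
            label, factor = lbl, fac
    return f"{n / factor:.2f} {label}"

print(format_bytes(2**20, SI_PREFIXES))   # 1.05 MB  (decimal sense)
print(format_bytes(2**20, IEC_PREFIXES))  # 1.00 MiB (binary sense)
[/code]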

 

What are your experiences with binary prefixes and society? Why do you suppose even the most intrepid techno-geeks fear to speak them in public? Will kibis and gibis ever find their way into popular usage? Please share your thoughts.

Posted

I had already seen it, but thought it was a matter of language, as there are different languages on my computer (i.e. something in one, something else in another). I also knew that a kilobyte is not a thousand bytes but 1024, etc., but I had never heard that there is another definition.

 

As to whether that's ever going to change, I don't know, and if it does, you in the US will most probably be the last to adopt it, because you still use feet, etc., even though the SI unit is the metre, and even the country that invented the foot (the UK) has switched to the metre (or at least uses both)...

Posted
Fewer, I suspect, are aware that, to address this common but technically imprecise misuse of SI prefixes (Kilo, Mega, Giga, etc.), in 1998 the IEC (with little argument the most important standards organization in the world) published in IEC 60027-2 a set of standard prefixes for powers of 2 – Kibi, Mebi, Gibi, etc. (For a description of the problem and solution straight from the IEC’s website, see IEC - IEC in action | SI zone > Prefixes for binary multiples.)

 

I question the value of this endeavor. Why have a set of special prefixes for binary if you are not going to have them for ternary, quaternary, quinary, octal, hexadecimal, sexagesimal or any other base you may wish to use? As many words have a variety of definitions dependent on their use, it seems the definitions of these prefixes should be treated the same.

Posted

I had never heard of the KiB etc before, interesting ;)

 

Though I was always aware that the definition was askew, since when you buy a new 250Gb HD and get it going, Windows says it has 232GB. I don't know if the capital B and little b have anything to do with it...

Posted
I question the value of this endeavor.
To clarify, the “kibi/mebi/gibi” standard has been in place for nearly a decade, but is little used. The endeavor, then, is its use, not its creation.
Why have a set of special prefixes for binary if you are not going to have them for ternary, quaternary, quinary, octal, hexadecimal, sexagesimal or any other base you may wish to use? As many words have a variety of definitions dependent on their use, it seems the definitions of these prefixes should be treated the same.
All the power-of-two bases – quaternary, octal, hexadecimal, etc. – are addressed by a base 2 unit standard. Others – ternary, quinary, sexagesimal, etc. – are, in my experience, so little used as to not require a standard. Seriously, outside of a few odd calendars, has anyone used sexagesimal since ancient Sumerian cuneiform was a standard, some 4000 years ago? :)

 

A couple of reasons to support the IEC 60027-2 standard come to mind:

  • Base 2 is a special, canonical base; there is no smaller integer number system base. Already, in the technical literature of many disciplines, powers of 2 are standard for describing large finite cardinalities (e.g., “about [math]2^{34}[/math] states”), although when writing for a popular audience, decimal equivalents are usually used.
  • Technical confusion exists. Consequential miscommunication occurs when one person uses an SI prefix in its decimal sense and another in its binary sense. For example, if I were to write a requirements document stating “This program will require 3.2 TB of storage”, meaning [math]3.2 \cdot 2^{40}[/math] bytes, and an engineer provided me with what seems a generously padded [math]3.5 \cdot 10^{12}[/math] bytes, I'd still not have the requested amount. A single instance of this sort of miscommunication can cost tens of kilo-dollars in project cost overruns, plus less tangible costs not limited to mutual name-calling ;).
  • Truth in advertising issues exist. Manufacturers of storage devices tend to advertise capacities using decimal-sensed prefixes. Software vendors tend to advertise storage requirements using binary-sensed prefixes. As commonly available storage devices exceed a tebibyte in capacity, this amounts to about a 10% exaggeration.
  • It’ll get worse. As noted above, as common numbers and capacities become larger, the discrepancy ratio between binary and decimal-sensed prefixes becomes larger (see the sketch after these lists).

and at least one significant reason not to:

  • The K, M, G, etc. prefixes usually appear in their binary sense before the unit abbreviation B. The problem might be better solved by requiring byte units to always be in the binary sense.
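
Here is the sketch mentioned above (Python; the figures come from the bullets, the loop itself is just my illustration):

[code]
# The growing gap between decimal- and binary-sensed prefixes, plus the
# 3.2 "TB" requirements example from the bullet list above.

for i, name in enumerate(["k/Ki", "M/Mi", "G/Gi", "T/Ti"], start=1):
    decimal = 10 ** (3 * i)
    binary = 2 ** (10 * i)
    print(f"{name}: binary/decimal = {binary / decimal:.4f}")
# k/Ki 1.0240, M/Mi 1.0486, G/Gi 1.0737, T/Ti 1.0995 -- roughly 10% at tera

requested = 3.2 * 2**40      # 3.2 TiB -- what the requirements document meant
delivered = 3.5 * 10**12     # 3.5 TB  -- what the engineer provided
print(f"shortfall: {requested - delivered:.3e} bytes")  # still ~1.8e10 bytes short
[/code]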

For myself, I think I'll resume using the IEC's cute-ish binary prefixes in speech and writing. Who knows? I might start a trend. At least I'll no longer be a slave to peer pressure. :protest:

Posted
Others – ternary, quinary, sexagesimal, etc. – are, in my experience, so little used as to not require a standard.

 

I would not discount ternary too quickly. At one time I invested quite a bit of research into the use of ternary digital systems for data storage. Byte for byte, it yielded much greater storage capacity for media that could store bits as negative, neutral or positive values. In my own preparation to patent this application, my patent search turned up 4 existing patents.
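
A quick back-of-the-envelope check of that capacity claim, assuming each storage cell holds one of the three states (the numbers below are just the information-theoretic upper bound, not any particular patented scheme):

[code]
import math

# One ternary cell (negative / neutral / positive) carries log2(3) bits of
# information, versus exactly 1 bit for a binary cell.
print(f"bits per ternary cell: {math.log2(3):.3f}")   # ~1.585

# Eight cells -- the width of a conventional byte:
print(f"8 binary cells:  {2**8} distinct values")     # 256
print(f"8 ternary cells: {3**8} distinct values")     # 6561
[/code]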

 

It was my intent to also explore binary negative-logic and positive-logic systems combined as a single ternary computational system, but my patent search regarding storage turned up 3 existing patents on this idea as well.

 

To date I have witnessed no real-world applications of the methods covered by these patents, but somebody must have had a reason for patenting them.

Posted
Though I was always aware that the definition was askew, since when you buy a new 250Gb HD and get it going, Windows says it has 232GB.
[math]232 \cdot 2^{30} = 249.108103168 \cdot 10^9[/math], so the labeling may just be based on a “round any amount up” rule.
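
The same arithmetic the other way around, as a quick sketch (Python, my own illustration):

[code]
# A drive sold as "250 GB" in the decimal sense, re-expressed in the binary
# sense that Windows reports.
advertised = 250 * 10**9                # 250 GB as the manufacturer counts it
print(f"{advertised / 2**30:.1f} GiB")  # ~232.8, reported as "232 GB"

reported = 232 * 2**30
print(f"{reported:,} bytes")            # 249,108,103,168 -- just under 250 * 10^9
[/code]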

 

I’ve had salesfolk try to explain this away by saying the difference was due to “formatting data” on the disk. While a disk may in principle have significantly less usable space than physical capacity, this is very rare now.

 

In the 1980s, when 270 MiB was considered a large disk pack, mixing formatting and usable data on a disk was a common practice. Such disks required “low level” formatting before data could be read from or written to them – this formatting was typically done by software utilities which varied from hardware to hardware – a disk pack low-level formatted for a Data General machine, for example, wouldn’t work on a Digital Equipment Corporation machine – though all PCs with similar BIOSes could usually exchange formatted hard drives with little difficulty.

 

Almost all modern (1990+) disk drives have their low-level formatting done at the factory. Typically, one complete platter face is used only for this positioning data, and the head that reads it isn’t even capable of rewriting it. Even the most egregious marketing wonks dare not include this face in the advertised capacity. So, when geeks brag of “low-level reformatting” a 300 GiB hard drive, they’re usually exager-lying. ;)

 

Floppy disks (remember those?) still use end user rewritable low-level formatting.

I don't know if the capital B and little b have anything to do with it...
I’d not noticed any significance to the case of the “B”s and “b”s in packaging and literature, but after Jay’s mention of it, I will keep an eye open. Perhaps there is a competing binary units standard out there, even less known than the IEC’s.
Posted

I have been bothered by this for a while, so here I go with a rant :P

 

I've known that a megabyte is ~1,024,000 bytes, but I still use megabyte, I've never heard of these before.

 

Actually you are very mistaken, so get ready for a lecture, and it all starts with bits:

 

A bit is a binary digit, represented by either a (1) or a (0).

 

When Dr. Buchholz was designing one of the first IBM computers in 1956, his computer instruction set defined a 4-bit byte-size field, and allowed, I think, up to 16 bits (I am certain that you can look this up on Wikipedia), and if I am not mistaken it was pushed to a 2- or 3-bit byte-size field in their production model. In any case, System/360 was the first to standardize a byte as 8 bits, because it can represent the 256 characters defined in the western world (ever seen an ASCII table?).

As an interesting side note, schools in the UK (and I remember this from my 5th grade informatics class) used to teach that a byte is a "bit (or binary) tuple" (e.g., septuple, octuple...).

A byte also consists of 2 smaller units, well known in assembly, called nybbles; a nybble (or nibble) is 4 bits. The byte is also the smallest unit of information storage, represents a letter (or symbol), and has also become the standard for memory accounting.
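
A small sketch of the nybble idea (the shift-and-mask idiom is standard; the function name is just mine):

[code]
def nybbles(byte):
    """Split a byte into its high and low 4-bit halves."""
    high = (byte >> 4) & 0x0F   # upper 4 bits
    low = byte & 0x0F           # lower 4 bits
    return high, low

print(nybbles(0xA7))   # (10, 7), i.e. nybbles 0xA and 0x7
[/code]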

 

So what do all those kilo, mega, kibi, mebi and so forth mean?

Well, while they are close, they are not the same value, and while we mean to say one, we really mean the other (needless to say, the ISA will never approve the official use of the prefix kibi and so forth).

 

Kilo means a thousand, a unit of measure of [math]10^{3}=1000[/math] (of anything). But confusion arises in the computer world because there cannot be a 1000-byte integer, since the computer operates in a binary system (think back to bits); therefore, all units and measures of storage are represented in binary, as powers of 2. The closest in this case is [math]2^{10}=1024[/math]. So when referring to a kilobyte in computing you really mean 1024 bytes, but that contradicts what most (non-computer-literate) people expect (and in the interest of not mixing the 2 types of kilobytes, the term kibibyte was created).

 

Mega means one million and is a unit of measure of [math]10^{6} = 1,000,000[/math]. But in the computer world, it equals [math]2^{20} = 1,048,576[/math], and is substituted by the mebibyte.

 

Giga (properly pronounced "Jigga", as in gigantic, but more and more commonly mispronounced with a harder G, as in girl) is a measure of [math]10^{9} = 1,000,000,000[/math], and in the computational world it equals [math]2^{30} = 1,073,741,824[/math], and is substituted by the gibibyte.
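
Putting the values from the three paragraphs above side by side (a tiny Python sketch of my own):

[code]
pairs = [("kilo / kibi", 10**3, 2**10),
         ("mega / mebi", 10**6, 2**20),
         ("giga / gibi", 10**9, 2**30)]
for name, decimal, binary in pairs:
    print(f"{name}: {decimal:>13,} vs {binary:>13,}")
# kilo / kibi:         1,000 vs         1,024
# mega / mebi:     1,000,000 vs     1,048,576
# giga / gibi: 1,000,000,000 vs 1,073,741,824
[/code]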

 

Hope the lecture helps. Oh, and one more thing: when to use a big or a small b or B.

 

B is a Byte, b is a bit.

 

Now to kilo, Mega, Giga and so forth (k and K in kilo are often treated as interchangeable, units-wise, but more correctly the decimal prefix is a small k while the binary Ki uses a large K); a short conversion sketch follows the list below.

 

kB - kilobyte

kb - kilobit

KiB - kibibyte

Kib - kibibit

Mb - megabit

MiB - mebibyte

GiB - gibibyte
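
And here is the conversion sketch promised above, showing why the case of the "b" matters (the 100 Mb/s figure is just my own example):

[code]
link_mbps = 100                        # a link advertised at 100 megabits/s
bytes_per_s = link_mbps * 10**6 / 8    # 8 bits per byte
print(f"{bytes_per_s / 10**6:.1f} MB/s")   # 12.5 MB/s  (decimal sense)
print(f"{bytes_per_s / 2**20:.1f} MiB/s")  # 11.9 MiB/s (binary sense)
[/code]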

 

Almost all modern (1990+) disk drives have their low-level formatting done at the factory.

Actually, what you think happens and what is actually done are two different things, and that is due to different definitions of "low-level format". They zero out the drive at the factory; a low-level format actually implies generating and writing random data to the hard drive, and there are plenty of tools for this, some of which do it better than others (that is what I do for a living for now... :( job opening, anyone?)
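
To make the "write random data" idea concrete, a hedged little sketch – aimed at an ordinary scratch file, NOT at a real block device, and not modeled on any particular vendor's tool (overwriting an actual drive needs elevated privileges and destroys its contents):

[code]
import os

TARGET = "scratch.bin"   # hypothetical target path, stands in for a device
SIZE = 16 * 2**20        # 16 MiB total, written in 1 MiB blocks
BLOCK = 2**20

with open(TARGET, "wb") as f:
    for _ in range(SIZE // BLOCK):
        f.write(os.urandom(BLOCK))   # cryptographically random bytes
[/code]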
