1). Rewrite the export program that produces the files you import. It can directly output either big-endian binary DataOutputStream format or character DataOutputStream format.
2). Write a separate translation program that reads the bytes and rearranges them. It can be written in any language.
3). Read the data as bytes and rearrange them on the fly.
4). The simplest way is to use the LEDataInputStream, LEDataOutputStream and LERandomAccessFile classes I wrote. They mimic DataInputStream, DataOutputStream and RandomAccessFile, but work with little-endian byte streams. You can read about LEDataStream. You can download the code and source free. The File I/O Amanuensis will show you how to use the classes; just tell it you have little-endian binary data.
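Option 3 can be sketched with the standard library alone. This assumes Java 5 or later, where Integer.reverseBytes (and its Short and Long siblings) exist; the class name SwapOnTheFly and the method readIntLE are invented for illustration and are not part of the LEDataStream package.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class SwapOnTheFly {
    // Let DataInputStream assemble four bytes big-endian,
    // then reverse the byte order of the resulting int.
    public static int readIntLE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    public static void main(String[] args) throws IOException {
        // 0x12345678 as a little-endian file would store it.
        byte[] raw = { 0x78, 0x56, 0x34, 0x12 };
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        System.out.println(Integer.toHexString(readIntLE(in))); // prints 12345678
    }
}
```

The swap costs one extra call per value, which is usually negligible next to the I/O itself.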
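2. You may not even have a problem. Many Java newbies coming from C think they need to consider whether the platform they run on uses big-endian or little-endian storage internally. In Java this is not an issue. Further, without resorting to native methods, you cannot observe how values are laid out in memory. Java has no struct I/O and no unions or any of the other endian-sensitive language constructs.
You only need to think about endianness when communicating with legacy C/C++ applications. The following code produces the same result on big-endian and little-endian machines: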
// take 16-bit short apart into two 8-bit bytes.
short x = (short) 0xabcd; // cast needed: 0xabcd does not fit in a signed short
byte high = (byte) (x >>> 8);
byte low = (byte) x; /* cast implies & 0xff */
System.out.println ("x=" + x + " high=" + high + " low=" + low );
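To see why the host CPU does not matter, here is a self-contained sketch of the same idea that also puts the two bytes back together. The class name SplitShort and its helper methods are invented for illustration. Shifts and casts operate on values, not on memory layout, so the output is the same everywhere.

```java
public class SplitShort {
    public static byte highByte(short x) {
        return (byte) (x >>> 8);
    }

    public static byte lowByte(short x) {
        return (byte) x;
    }

    // Reassemble; the low byte must be masked so its sign bits
    // do not smear over the high byte.
    public static short join(byte high, byte low) {
        return (short) (high << 8 | (low & 0xff));
    }

    public static void main(String[] args) {
        short x = (short) 0xabcd;
        byte high = highByte(x);
        byte low = lowByte(x);
        System.out.println("x=" + x + " high=" + high + " low=" + low);
        // prints x=-21555 high=-85 low=-51
        System.out.println(join(high, low) == x); // prints true
    }
}
```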
The most common problem is dealing with files stored in little-endian format.
I had to implement routines parallel to those in DataInputStream, which reads raw binary, in my LEDataInputStream and LEDataOutputStream classes. Don't confuse this with the io.DataInput human-readable character-based file-interchange format.
If you wanted to do it yourself, without the overhead of the full LEDataInputStream and LEDataOutputStream classes, here is the basic technique:
Presuming your integers are in 2's complement little-endian format, shorts are pretty easy to handle:
short readShortLittleEndian() throws IOException {
    // 2 bytes
    int low = readByte() & 0xff;
    int high = readByte() & 0xff;
    return (short) (high << 8 | low);
}
Or if you want to get clever and puzzle your readers, you can avoid one mask since the high bits will later be shaved off by conversion back to short.
short readShortLittleEndian() throws IOException {
    // 2 bytes
    int low = readByte() & 0xff;
    int high = readByte();
    // avoid masking here
    return (short) (high << 8 | low);
}
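A runnable sketch of both variants, reading from an in-memory stream, shows that they agree even when the high byte is negative. The class name ShortLE and the method names readMasked and readUnmasked are invented for illustration; the original routines call readByte() directly because they live inside a stream class.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ShortLE {
    // Both bytes masked.
    public static short readMasked(DataInputStream in) throws IOException {
        int low = in.readByte() & 0xff;
        int high = in.readByte() & 0xff;
        return (short) (high << 8 | low);
    }

    // High byte left sign-extended; the cast to short discards the extra bits.
    public static short readUnmasked(DataInputStream in) throws IOException {
        int low = in.readByte() & 0xff;
        int high = in.readByte();
        return (short) (high << 8 | low);
    }

    public static DataInputStream wrap(byte[] raw) {
        return new DataInputStream(new ByteArrayInputStream(raw));
    }

    public static void main(String[] args) throws IOException {
        // 0xabcd stored least significant byte first.
        byte[] raw = { (byte) 0xcd, (byte) 0xab };
        System.out.println(readMasked(wrap(raw)));   // prints -21555
        System.out.println(readUnmasked(wrap(raw))); // prints -21555
    }
}
```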
Longs are a little more complicated:
long readLongLittleEndian() throws IOException {
    // 8 bytes
    long accum = 0;
    for (int shiftBy = 0; shiftBy < 64; shiftBy += 8) {
        // must cast to long or shift done modulo 32
        accum |= (long) (readByte() & 0xff) << shiftBy;
    }
    return accum;
}
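The remark about the cast is easy to check. The sketch below runs the routine against an in-memory stream, next to a deliberately broken copy that omits the cast. The class name LongLE and the method readLongWithoutCast are invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class LongLE {
    public static long readLongLittleEndian(DataInputStream in) throws IOException {
        long accum = 0;
        for (int shiftBy = 0; shiftBy < 64; shiftBy += 8) {
            accum |= (long) (in.readByte() & 0xff) << shiftBy;
        }
        return accum;
    }

    // Deliberately wrong: without the cast the shift is done on an int,
    // so the shift count is taken modulo 32 and the upper four bytes
    // land on top of the lower four.
    public static long readLongWithoutCast(DataInputStream in) throws IOException {
        long accum = 0;
        for (int shiftBy = 0; shiftBy < 64; shiftBy += 8) {
            accum |= (in.readByte() & 0xff) << shiftBy;
        }
        return accum;
    }

    public static DataInputStream wrap(byte[] raw) {
        return new DataInputStream(new ByteArrayInputStream(raw));
    }

    public static void main(String[] args) throws IOException {
        // 0x0102030405060708 stored least significant byte first.
        byte[] raw = { 8, 7, 6, 5, 4, 3, 2, 1 };
        System.out.println(Long.toHexString(readLongLittleEndian(wrap(raw))));
        // prints 102030405060708
        System.out.println(Long.toHexString(readLongWithoutCast(wrap(raw))));
        // prints 506070c
    }
}
```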
In a similar way we handle char and int.
char readCharLittleEndian() throws IOException {
    // 2 bytes
    int low = readByte() & 0xff;
    int high = readByte();
    return (char) (high << 8 | low);
}
int readIntLittleEndian() throws IOException {
    // 4 bytes
    int accum = 0;
    for (int shiftBy = 0; shiftBy < 32; shiftBy += 8) {
        accum |= (readByte() & 0xff) << shiftBy;
    }
    return accum;
}
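As a quick check, the char and int routines can be run against a small in-memory stream. The class name IntCharLE is invented for illustration, and the routines take the stream as a parameter here only so the sketch is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class IntCharLE {
    public static char readCharLittleEndian(DataInputStream in) throws IOException {
        int low = in.readByte() & 0xff;
        int high = in.readByte();
        return (char) (high << 8 | low);
    }

    public static int readIntLittleEndian(DataInputStream in) throws IOException {
        int accum = 0;
        for (int shiftBy = 0; shiftBy < 32; shiftBy += 8) {
            accum |= (in.readByte() & 0xff) << shiftBy;
        }
        return accum;
    }

    public static DataInputStream wrap(byte[] raw) {
        return new DataInputStream(new ByteArrayInputStream(raw));
    }

    public static void main(String[] args) throws IOException {
        // 'A' (0x0041), then -2 (0xfffffffe), both least significant byte first.
        byte[] raw = { 0x41, 0x00, (byte) 0xfe, (byte) 0xff, (byte) 0xff, (byte) 0xff };
        DataInputStream in = wrap(raw);
        System.out.println(readCharLittleEndian(in)); // prints A
        System.out.println(readIntLittleEndian(in));  // prints -2
    }
}
```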
Floating point is a little trickier. Presuming your data is in IEEE little-endian format, you need something like this:
double readDoubleLittleEndian() throws IOException {
    // 8 bytes
    long accum = 0;
    for (int shiftBy = 0; shiftBy < 64; shiftBy += 8) {
        // must cast to long or shift done modulo 32
        accum |= ((long) (readByte() & 0xff)) << shiftBy;
    }
    return Double.longBitsToDouble(accum);
}
float readFloatLittleEndian() throws IOException {
    // 4 bytes
    int accum = 0;
    for (int shiftBy = 0; shiftBy < 32; shiftBy += 8) {
        accum |= (readByte() & 0xff) << shiftBy;
    }
    return Float.intBitsToFloat(accum);
}
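One way to test the floating-point routines is to generate little-endian test bytes with java.nio.ByteBuffer (Java 1.4 or later) and read them back. ByteBuffer is used here only to build the test data; the class name FloatingLE is invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FloatingLE {
    public static double readDoubleLittleEndian(DataInputStream in) throws IOException {
        long accum = 0;
        for (int shiftBy = 0; shiftBy < 64; shiftBy += 8) {
            accum |= ((long) (in.readByte() & 0xff)) << shiftBy;
        }
        return Double.longBitsToDouble(accum);
    }

    public static float readFloatLittleEndian(DataInputStream in) throws IOException {
        int accum = 0;
        for (int shiftBy = 0; shiftBy < 32; shiftBy += 8) {
            accum |= (in.readByte() & 0xff) << shiftBy;
        }
        return Float.intBitsToFloat(accum);
    }

    // Build 12 bytes of little-endian IEEE data: a double followed by a float.
    public static DataInputStream sample(double d, float f) {
        byte[] raw = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN)
                .putDouble(d).putFloat(f).array();
        return new DataInputStream(new ByteArrayInputStream(raw));
    }

    public static void main(String[] args) throws IOException {
        DataInputStream in = sample(-1234.5, 0.25f);
        System.out.println(readDoubleLittleEndian(in)); // prints -1234.5
        System.out.println(readFloatLittleEndian(in));  // prints 0.25
    }
}
```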
You don't need a readByteLittleEndian since the code would be identical to readByte, though you might create one just for consistency:
byte readByteLittleEndian() throws IOException {
    // 1 byte
    return readByte();
}
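Writing is the mirror image of reading: emit the least significant byte first and shift upward. The following is a minimal sketch of that idea, not the code of LEDataOutputStream; the class name WriteLE and its methods are invented for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

public class WriteLE {
    public static void writeShortLittleEndian(OutputStream out, short value) throws IOException {
        out.write(value);       // low byte; write(int) keeps only the low 8 bits
        out.write(value >>> 8); // high byte
    }

    public static void writeIntLittleEndian(OutputStream out, int value) throws IOException {
        for (int shiftBy = 0; shiftBy < 32; shiftBy += 8) {
            out.write(value >>> shiftBy);
        }
    }

    public static byte[] intBytes(int value) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeIntLittleEndian(out, value);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // 0x12345678 comes out as 78 56 34 12.
        System.out.println(Arrays.toString(intBytes(0x12345678)));
        // prints [120, 86, 52, 18]
    }
}
```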
In Gulliver's Travels the Lilliputians liked to break their eggs on the small end and the Blefuscudians on the big end. They fought wars over this. There is a computer analogy. Should numbers be stored most or least significant byte first? This is sometimes referred to as byte sex.
Those in the big-endian camp (most significant byte stored first) include the Java VM virtual computer, the Java binary file format, the IBM 360 and follow-on mainframes such as the 390, the Motorola 68K and most other mainframes. The Power PC is endian-agnostic.
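Java's big-endian file format is easy to observe: DataOutputStream.writeInt emits the most significant byte first on every platform. A small sketch (the class name BigEndianDemo is invented for illustration):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class BigEndianDemo {
    // Return the four bytes DataOutputStream produces for one int.
    public static byte[] writeIntBytes(int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(value);
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(writeIntBytes(0x01020304)));
        // prints [1, 2, 3, 4]
    }
}
```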
Blefuscudians (big-endians) assert this is the way God intended integers to be stored, most important part first. At an assembler level fields of mixed positive integers and text can be sorted as if it were one big text field key. Real programmers read hex dumps, and big-endian is a lot easier to comprehend.
In the little-endian camp (least significant byte first) are the Intel 8080, 8086, 80286, Pentium and follow-ons, and the MOS Technology 6502 popularised by the Apple ][.
Lilliputians (little-endians) assert that putting the low order part first is more natural because when you do arithmetic manually, you start at the least significant part and work toward the most significant part. This ordering makes writing multi-precision arithmetic easier since you work up not down. It made implementing 8-bit microprocessors easier. At the assembler level (not in Java) it also lets you cheat and pass the address of a 32-bit positive int to a routine expecting only a 16-bit parameter and still have it work. Real programmers read hex dumps, and little-endian is more of a stimulating challenge.
If a machine is word addressable, with no finer addressing supported, the concept of endianness means nothing since words are fetched from RAM in parallel, both ends first.
5. What Sex Is Your CPU?
Byte Sex Endianness of CPUs
AMD Duron, Athlon, Thunderbird; MOS Technology 6502
The 6502 was used in the Apple ][; the Duron, Athlon and Thunderbird in PCs running Windows 95/98/ME/NT/2000/XP.
Apple ][ 6502
Apple Mac 68000
Uses Motorola 68000
Apple Power PC
CPU is bisexual but stays big in the Mac OS.
Burroughs 1700, 1800, 1900
bit addressable. Used different interpreter firmware instruction sets for each language.
Burroughs 7800
Algol machine
word-addressable only, hence no endianness
31½ bit words. Low order bit must be 0 on the drum, but can be 1 in the accumulator.
CDC 3300, 6600
IBM 360, 370, 380, 390
IBM 7044, 7090
word addressable
36 bits
IBM AS-400
Power PC
The endian-agnostic Power-PC's have a foot in both camps. They are bisexual, but the OS usually imposes one convention or the other. e.g. Mac PowerPCs are big-endian.
Intel 8080, 8086, 80286, 80386, 80486, Pentium I, II, III, IV
Chips used in PCs
Intel 8051
MIPS R4000, R5000, R10000
Used in Silicon Graphics IRIX.
Motorola 6800, 6809, 680x0, 68HC11
Early Macs used the 68000. Amiga.
NCR 8500
NCR Century
Sun Sparc and UltraSparc
Sun's Solaris. Normally used as big-endian, but also supports operating in little-endian mode, including being able to switch endianness under program control for particular loads and stores.
Univac 1100
36-bit words.
Univac 90/30
IBM 370 clone
Zilog Z80
Used in CP/M machines.
If you know the endianness of other CPUs/OSes/platforms please email me at
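If you are merely curious about the machine you are running on, java.nio.ByteOrder (Java 1.4 or later) will report the hardware's native byte order. This matters mainly for NIO buffers; ordinary Java arithmetic and DataInputStream/DataOutputStream behave the same either way. The class name NativeOrder is invented for illustration.

```java
import java.nio.ByteOrder;

public class NativeOrder {
    // Describe the byte order of the hardware the JVM is running on.
    public static String describe() {
        ByteOrder order = ByteOrder.nativeOrder();
        return order == ByteOrder.LITTLE_ENDIAN ? "little-endian" : "big-endian";
    }

    public static void main(String[] args) {
        System.out.println("This CPU is " + describe());
    }
}
```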
In theory data can have two different byte sexes but CPUs can have four. Let us give thanks, in this world of mixed left and right hand drive, that there are not real CPUs with all four sexes to contend with.
The Four Possible Byte Sexes for CPUs
Which Byte
Is Stored in the
Which Byte
Is Addressed?
Intel, AMD, Power PC, DEC.
none that I know of.
Perhaps one of the old word mark architecture machines.
Mac, IBM 390, Power PC