The EBCDIC Format: A Multipart Series

For a developer who has only written software for server and client systems, it is easy to view monolithic mainframe systems with a bit of a legacy mindset. Whether or not the legacy label is actually deserved, the fact remains that, depending on the industry you write software for, there are a lot of mainframe systems out there that we must interface with, or at least with the output of these systems.

By and large, the output from these systems will be in a binary format referred to as EBCDIC, an acronym coined by IBM for Extended Binary Coded Decimal Interchange Code. If a file contained only uncompressed (known in EBCDIC-land as unpacked) plain-text fields, with no special features like field Redefines or Occurs, the translation would be as easy as flipping the code page to ASCII to translate the bytes.
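For that simple case, the conversion really is close to a one-liner. Here is a minimal Python sketch, assuming the data uses IBM code page 037 (your source system may well use a different one) and that every byte is plain unpacked text. The function name `ebcdic_to_text` is mine, purely for illustration.

```python
def ebcdic_to_text(data: bytes, codepage: str = "cp037") -> str:
    """Decode plain, unpacked EBCDIC text bytes into a Python string.

    Assumption: the bytes were written with IBM code page 037. Other
    EBCDIC code pages exist, so check what the source system uses.
    """
    return data.decode(codepage)


# Hypothetical sample: the word "HELLO" as EBCDIC (cp037) bytes.
sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])

text = ebcdic_to_text(sample)
print(text)

# Once decoded, the text can be written out as ASCII bytes.
ascii_bytes = text.encode("ascii")
```

This only works because every byte in the sample is a character. It is not a general converter for real EBCDIC files.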

However, that is rarely the case when dealing with an EBCDIC file and, as with most problems in computer science, the devil is in the details.

This multipart series will break down each of the facets that pose a problem when converting EBCDIC data into ASCII so that it can be of use on PC-based systems.

EBCDIC is a curious format that seems bizarre and out of date in the modern world. However, you will still find large data sets in some very high-revenue industries shipped around in EBCDIC. There are several good sites on the web that cover the background and basics of COBOL and EBCDIC file formats, so I won't repeat their content here but will instead provide links in case you are interested.

I will, however, explore three major topics and why each is a problem when you try to simply convert the encoded text found in a raw EBCDIC file.