The case for COBOL

Perhaps somewhat unexpectedly, on March 14 of this year the GCC mailing list received an announcement of the first COBOL front-end for the GCC compiler. For the uninitiated, COBOL saw its first release in 1959, which at 63 years makes it one of the oldest programming languages still in regular use. The reason for its staying power is that it was designed from the beginning as a transaction-oriented, domain-specific language (DSL).

Its acronym stands for COmmon Business-Oriented Language, which makes the domain it targets explicit. Even in the current COBOL 2014 standard it is still essentially the same transaction-oriented language, albeit with added support for structured, procedural, and object-oriented programming styles. Taking most of its core from Grace Hopper's FLOW-MATIC language, it allows the business logic one would encounter at a financial institution or other business to be described in plain English.

Unlike the existing GnuCOBOL project, which translates COBOL to C, the new GCC-COBOL front-end project eliminates that intermediate step and compiles COBOL source code directly into binary code. All of which raises the question of why an entire man-year was invested in this effort, for a language that has been declared 'dead' for probably at least half of its 63-year existence.

Does it make sense to learn or even use COBOL today? Do we need a new COBOL compiler?

Punch card beginnings

An IBM 704 mainframe used at NACA in 1957. (Credit: NASA)

To fully understand where COBOL came from, we need to go back to the 1950s, years before minicomputers like the PDP-8, let alone home computers like the Apple I, were even on the horizon. In those days the dinosaurs of computing, increasingly transistorized mainframes with highly divergent system architectures, were tucked away in the depths of universities and businesses.

Such differences existed even within a single manufacturer's line of mainframes, as with IBM's 700 and 7000 series. Since each mainframe was targeted at its intended purpose, usually either scientific or commercial, this often meant that software written for a business or university would not run on newer hardware without extensive modifications or a full rewrite, which added significantly to the cost.

Even before COBOL came on the scene, this issue had been recognized by people like John W. Backus of BNF fame, who in late 1953 proposed to his superiors at IBM the development of a practical alternative to assembly language. This led to the FORTRAN scientific programming language, which, along with the LISP mathematical programming language, initially targeted the IBM 704 scientific mainframe.

FORTRAN and other high-level programming languages offered two advantages over programming a mainframe in assembly: portability and development efficiency. The latter came primarily from being able to write single high-level statements that the compiler translates into an optimized series of assembly instructions for the target hardware. This allowed scientists and others to create their own programs as part of their research, studies, or other applications, instead of first having to learn the architecture of a particular mainframe.

The portability of a high-level language meant that scientists could share their FORTRAN programs with others, who could then run them on their own institute's mainframe, regardless of that mainframe's system architecture and other hardware details. All that was needed was an available FORTRAN compiler.

UNIVAC I operator console at the Museum of Science in Boston, USA.

Where FORTRAN and LISP focused on simplifying programming in the scientific domain, businesses had very different needs. Businesses follow rules set by the tax office and other official bodies: strict sets of procedures that must be followed to turn inputs such as transactions and revenue streams into payrolls and quarterly statements. Translating those written business rules into something that behaved in exactly the same way on a mainframe was a major challenge. It was here that Grace Hopper's FLOW-MATIC language, originally called Business Language 0, or B-0, provided a solution, targeting the UNIVAC I, the world's first dedicated business computer.

Grace Hopper's experience had shown that businesses much preferred the use of plain English words over symbols and mathematical notation. Hopper's role as technical advisor to the CODASYL committee, which developed the first COBOL standard, was a recognition of both FLOW-MATIC's success and of Hopper's expertise in this area. As she stated in a 1980 interview, COBOL 60 is 95% FLOW-MATIC. The other 5% came from competing languages, such as IBM's COMTRAN, which shared the same idea but had a very different implementation.

Interestingly, a defining feature of COBOL prior to the 2002 standard was its column-based coding style, derived from the use of 80-column punch cards. Which brings us to the many feature updates the COBOL standard has received over the decades.

A product of its time

An IBM COBOL coding form from the 1960s.

One interesting aspect of domain-specific languages in particular is that they reflect both the state of technology and of the targeted domain at the time. When COBOL came into use in the 1960s, programming was not done directly on the computer system; instead, code was usually delivered to the mainframe in the form of punch cards or, if you were lucky, magnetic tape. In the 1960s this meant that 'running a program' involved handing your stack of punch cards or filled-in coding forms to the mainframe operators, who would run the program for you and hand you back the results.

These intermediate steps implied additional complexity when developing new COBOL programs, and the column-based style remained the only option up to and including the COBOL-85 update. With the next standard update, in 2002, many changes were made, however, including the elimination of mandatory column-based alignment in favor of free-form code. This update also added object-oriented programming and other features, along with fewer restrictions on strings and more data types beyond numerical representations.
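To give an idea of what that column-based style looked like: in fixed-form COBOL, columns 1-6 hold an optional sequence number, column 7 is the indicator area (an asterisk there marks a comment line), Area A spans columns 8-11 and holds division and section headers and paragraph names, and Area B spans columns 12-72 and holds statements. A minimal fixed-form sketch (the program name is arbitrary; GnuCOBOL, for example, accepts this layout as its default fixed format):

```cobol
000100* COMMENT LINE: ASTERISK IN COLUMN 7.
000200 IDENTIFICATION DIVISION.
000300 PROGRAM-ID. FIXEDFRM.
000400 PROCEDURE DIVISION.
000500     DISPLAY "HELLO FROM FIXED FORM".
000600     STOP RUN.
```

Note how `IDENTIFICATION DIVISION.` starts in Area A (column 8), while the `DISPLAY` statement starts in Area B (column 12), exactly the alignment the coding forms and punch cards enforced.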

What remained unchanged was COBOL's lack of code blocks. Instead, COBOL source is divided into four divisions:

  • Identification division
  • Environment division
  • Data division
  • Procedure division

The Identification Division specifies the name of the program and meta information, in addition to class and interface specifications. The Environment Division specifies any features of the program that depend on the system it runs on, such as files and character sets. The Data Division is used to declare variables and parameters. The Procedure Division contains the program's statements. Finally, each division is subdivided into sections, each of which is made up of paragraphs.
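As a minimal sketch of that four-division structure, here is a toy payroll calculation in free-form COBOL (as permitted since the 2002 standard; the program name, data items, and values are invented for illustration):

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. PAYROLL-DEMO.

ENVIRONMENT DIVISION.
CONFIGURATION SECTION.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 GROSS-PAY  PIC 9(5)V99 VALUE 1000.50.
01 TAX-RATE   PIC V99     VALUE 0.21.
01 NET-PAY    PIC 9(5)V99.

PROCEDURE DIVISION.
MAIN-PARAGRAPH.
    COMPUTE NET-PAY = GROSS-PAY * (1 - TAX-RATE).
    DISPLAY "NET PAY: " NET-PAY.
    STOP RUN.
```

Note how the `PIC` (picture) clauses in the Data Division fix each item's exact digit count and decimal position, which is precisely the kind of unambiguous, decimal-exact arithmetic that business rules demand.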

An IBM z14 mainframe from 2017, based on the IBM z/Architecture CISC ISA.

With the most recent COBOL update, from 2014, the floating-point type format was changed to IEEE 754, further improving its interoperability with other data formats. As Charles R. Martin noted in his COBOL introduction on The Overflow, however, an accurate comparison for COBOL would be with other domain-specific languages, such as SQL (introduced in 1974). One could add the likes of PostScript, Fortran, or Lisp to that comparison.

While it is technically possible to use SQL and PostScript for general-purpose programming, and to mimic the features of these DSLs in a generic (systems) programming language, this is neither a quick nor an efficient use of one's time. All of which rather illustrates the raison d'être of these DSLs: to program as efficiently and directly as possible within a specific domain.

This point is rather succinctly illustrated by IBM's Programming Language One (PL/I), introduced in 1964: a generic programming language intended to compete with everything from FORTRAN to COBOL, which ultimately displaced none of them, as neither FORTRAN nor COBOL programmers could be convinced of PL/I's qualities.

It is important to understand that you would not write an operating system or a word processor in any of these DSLs. This lack of generality both reduces their complexity and is the reason we should judge them on their merits as a DSL for their intended domain.

The right tool

An interesting aspect of COBOL is that the committee that created it was not made up of computer scientists, but of people from the business community, strongly influenced by the needs of manufacturers such as IBM, RCA, Sylvania, General Electric, Philco, and National Cash Register, as well as the business owners and government agencies they did business with.

As a result, much as the need to efficiently define database queries and related tasks shaped SQL, the need to streamline business transactions and administration has shaped COBOL over the decades. Even today, most of the world's banking and stock trading is handled by mainframe code written in COBOL, largely thanks to decades of refining the language to eliminate ambiguities and other problems that can lead to very costly bugs.

Attempts to port business applications written in COBOL have shown that the problem with moving statements from a DSL to a generic language is that the latter does none of the checking, protecting, and validating for which the DSL was created in the first place. The more generic a language is, the more unintended consequences a statement can have, which means that instead of porting a COBOL or FORTRAN (or SQL) statement verbatim, you need to keep in mind all the checks, limitations, and safeguards of the original language and its runtime.

Ultimately, any attempt to port this kind of code to a generic language will inevitably result in the DSL effectively being reimplemented in the target language, only with a greater likelihood of bugs, for a variety of reasons. Which means that while a generic programming language can implement the same functionality as these DSLs, the real question is whether that is desirable at all, especially when the cost of downtime and errors is measured in millions of dollars per second, as in a nation's financial system.

The attraction of a DSL here is that it sidesteps many potential corner cases and problems by simply not implementing the features that enable those problems.

Where GCC-COBOL fits

Despite strong demand, there is an acute shortage of COBOL developers. And although GCC-COBOL is, like GnuCOBOL, not a formally validated compiler that would be adopted anywhere near the IBM z/OS-powered mainframes of a financial institution, it does play the invaluable role of making a COBOL toolchain easily accessible. This in turn enables hobbyists and students to develop in COBOL, whether for fun or for a potential career.
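As an illustration of how low the barrier to entry now is, a first COBOL program can be built today with GnuCOBOL's `cobc` compiler, as sketched below (the GCC-COBOL front-end's own invocation may differ; the file name is just a placeholder):

```shell
# Write a minimal free-form COBOL program to a file.
cat > hello.cob <<'EOF'
IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO.
PROCEDURE DIVISION.
    DISPLAY "HELLO, WORLD".
    STOP RUN.
EOF

# -x: build a standalone executable; -free: accept free-form source.
cobc -x -free hello.cob -o hello
./hello
```

GnuCOBOL is packaged for most Linux distributions, so this is a one-command install away for anyone curious about the language.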

Without investing in a proprietary toolchain and its associated ecosystem, a business could also use such an open-source toolchain to keep legacy payroll-processing and similar applications running in COBOL, rather than replacing them with Java or the like. According to the developer behind GCC-COBOL in the mailing list announcement, this is one of its goals: enabling mainframe COBOL applications to run on Linux systems.

While financial institutions are still far more likely to opt for an IBM Z system mainframe ('Z' as in 'Zero Downtime') and the corresponding bulletproof service agreements, it is good to see such an important DSL become more accessible to all, with no strings attached.
