The compiler translates the source code (the Pascal statements like writeln) of your program into machine language. That's called the object code version of your program. The machine language is just a series of numbers that the CPU recognizes as instructions.
When you run your program, the operating system reads in the object code and hands over control to those instructions.
The machine language instructions don't use named variables. Instead, they refer to locations in memory that the compiler has assigned to each variable. For example, your variable called "tax" may be stored in location 27. If your program adds 100 to tax, then the machine language instructions will get the current value from location 27, move it into the CPU, add 100, and move the result back to location 27.
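Here is a minimal sketch of the Pascal side of that example. The starting value of 250 is an assumption made up for illustration, and the actual memory location is chosen by the compiler, so you never see the number 27 in your source code.

```pascal
program TaxDemo;
var
  tax: integer;        { the compiler assigns a memory location to tax }
begin
  tax := 250;          { store a starting value (assumed for this sketch) }
  tax := tax + 100;    { fetch the value, add 100 in the CPU, store it back }
  writeln(tax)         { prints 350 }
end.
```

The single statement tax := tax + 100 is what turns into the "get, add, move back" sequence of machine instructions.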
Since different kinds of data (integers, real numbers, etc.) occupy different amounts of space in memory and require different kinds of CPU operations, the compiler has to know what type of data is in each variable.
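You can ask the compiler how much space it sets aside for each type. The sketch below assumes a compiler that provides SizeOf, such as Turbo Pascal or Free Pascal; SizeOf is an extension and not part of standard Pascal. The numbers printed depend on the compiler and the machine, so none are shown here.

```pascal
program TypeSizes;
var
  count: integer;
  price: real;
  initial: char;
begin
  { SizeOf reports the number of bytes reserved for each variable }
  writeln('integer: ', SizeOf(count), ' bytes');
  writeln('real:    ', SizeOf(price), ' bytes');
  writeln('char:    ', SizeOf(initial), ' bytes')
end.
```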
Files are collections of data.
On mainframe computer systems, files are usually structured as collections of "records", which have a fixed length. For example, each employee might have a record that included 20 characters for the name, 50 for the address, and 8 for salary. Each of those would be called a "field" -- the name field, the address field, the salary field.
Structured files make it faster to find data in a large file. If you have a small "index file" that tells you employee "Smith" is described by record 2987, then you can just jump ahead 2986 record lengths from the beginning of the file, and read Smith's record.
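The employee record and the "jump ahead" arithmetic can be sketched in Pascal as follows. The names EmployeeRecord, RecordLen, and offset are made up for illustration, and storing the salary as 8 characters is an assumption. The longint type and the constant expression for RecordLen are Turbo Pascal / Free Pascal features, not standard Pascal.

```pascal
program RecordOffset;
const
  NameLen    = 20;
  AddressLen = 50;
  SalaryLen  = 8;
  RecordLen  = NameLen + AddressLen + SalaryLen;   { 78 characters per record }
type
  { one fixed-length record with three fields (shown only to illustrate the layout) }
  EmployeeRecord = record
    name:    packed array[1..NameLen] of char;
    address: packed array[1..AddressLen] of char;
    salary:  packed array[1..SalaryLen] of char
  end;
var
  recordNumber: longint;
  offset: longint;
begin
  recordNumber := 2987;                        { Smith's record, from the index file }
  offset := (recordNumber - 1) * RecordLen;    { skip the 2986 records in front of it }
  writeln('Skip ', offset, ' characters to reach record ', recordNumber)
end.
```

With 78 characters per record, the program skips 2986 * 78 = 232908 characters. No searching is needed because every record is the same length.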
On personal computers with fast hard disks and relatively small files, structured records are less important. Most of the data on a PC is just stored in unstructured format, as a "stream" of bits.
Every program is designed to accomplish a certain task. As you write the program, you need to consider that task as it relates to the interests of three "parties":
When talking about programming languages, the word SEMANTICS refers to the meaning of a word or statement or expression, without regard to exactly how it's spelled or written. "A plus B" and "A + B" mean the same thing. A programming language's semantics defines what concepts it can represent: addition, loops, branches, etc.
The word SYNTAX is used to describe the details of how things must be written to accurately represent the intended meaning. In Pascal, syntax covers the use of semicolons, the allowable length of variable names, the location of the VAR statement near the beginning of the program, and lots of other picky details that you wish you didn't have to deal with.
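As a small illustration, here is a tiny program with some of those syntax details marked in comments. The variable names are made up for this sketch. The meaning (add two numbers and print the result) is the semantics; everything the comments point at is syntax.

```pascal
program SyntaxDemo;        { the program header ends with a semicolon }
var                        { declarations come before the first begin }
  a, b, sum: integer;
begin
  a := 2;                  { semicolons separate statements }
  b := 3;
  sum := a + b;            { "A plus B" has to be spelled a + b }
  writeln(sum)             { prints 5; no semicolon is needed before end }
end.                       { the program ends with a period }
```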
Good programming style will make it easier for you to read your own program as you develop it, and it will also make it easier for future programmers to work with the code you write. (The future programmer might be you a year from now!) Because the program is clearer and better organized, it's also much more likely to be correct.
Some important features of good style are:
The GOTO statement was the only way to branch and loop in early programs. The result was "spaghetti code", which was almost impossible to read and maintain. In most modern programming languages, including Pascal, there is almost never any reason to use the GOTO statement. You can do everything you need with while loops, for loops, if statements, and case statements.
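For example, adding up the numbers 1 through 5 needs no GOTO at all. The sketch below does the job twice, once with a while loop and once with a for loop; the variable names are made up for illustration.

```pascal
program LoopDemo;
var
  i, total: integer;
begin
  { while loop: repeat as long as the condition holds }
  total := 0;
  i := 1;
  while i <= 5 do
  begin
    total := total + i;
    i := i + 1
  end;
  writeln('while total: ', total);    { prints 15 }

  { for loop: the same job, with the counting handled for you }
  total := 0;
  for i := 1 to 5 do
    total := total + i;
  writeln('for total: ', total)       { prints 15 }
end.
```

In both versions you can see where the loop starts and ends just by looking at the indentation, which is exactly what spaghetti code makes impossible.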
The waterfall method of program development and design assumes you'll go through several stages: needs analysis, specification, design, implementation, testing, installation, maintenance. (The stages vary a bit depending on who's listing them.)
The main problem with the waterfall method is that it expects you to do everything right at each stage, and never go back to an earlier stage. For large systems, especially those that the user interacts with (like the ones we're writing), history has shown this just doesn't work. It's essential to develop the system part way, test it with users, then go back and redo earlier stages of the process. This is "iterative design".
Several studies over the last few years have shown that computers aren't doing much to improve productivity in a lot of businesses. It seems that computers are helping people to do more work, but a lot of the work isn't very useful. And just learning to use the computers takes too much time.
Where computers have succeeded, especially interactive computer systems (ones that the user operates through a keyboard and screen), there has been a strong focus on two principles for systems design: focus on users and tasks, and iterative design.
Focus on users and tasks requires the system designer to talk to the users, find out who they are and what skills they have, and decide what the computer can actually help them do. Sounds obvious, but lots of systems are designed to do what managers think needs to be done, without ever looking at what the employees actually need.
One technique to ensure task and user focus is: Identify and document actual tasks -- real jobs -- that users have done in the past. As you design the computer system, test it to see if it can do those specific tasks, and see if it's any faster or more accurate than the users' current approach. This can work along with a more traditional "requirements analysis" phase of design, which describes the tasks in abstract terms.
Iterative design is the process of repeatedly roughing out a system, evaluating how well it will work, and changing the design to fix the problems identified in the evaluation. You try to advance a bit on each cycle, moving from evaluation in your head, to evaluation with a group of experts, to evaluation with a few real users. But each iteration requires some rethinking of what's gone before.
Iteration -- repeated revision -- is especially important with interactive systems because we know from experience that there will always be usability problems. Even systems designed by very good designers will produce unpleasant surprises when handed off to real users. Fortunately, you can catch huge percentages of the problems, typically 50 percent or more, by just testing the system with 3 to 5 users... and then revising it and, if possible, testing it again.
Lots of programming languages are very similar to Pascal.
Algol was designed by a committee in 1958, revised in '60 and '68.
Niklaus Wirth designed Pascal as a simpler version of Algol. It was especially intended for teaching, but it became popular for general use.
Ada, which is similar to Pascal but much more powerful, was developed for the US Department of Defense. It's intended for use on very large projects, where the program needs to be divided into many small parts that can be developed by different programmers and tested separately.
C was developed at Bell Labs by Dennis Ritchie, one of the guys who also invented Unix. (Brian Kernighan co-wrote the best-known book on the language with him.) The Unix team wanted a powerful language for writing operating systems, so C can do things with bits and types and hardware that make it easy for the programmer to get in trouble.
C++ adds object-oriented programming to C, and makes it stricter about types, more like Pascal.
Java is similar to C++, but intended for use with programs running on the Internet.
All these languages use slightly differing vocabularies and syntax (punctuation, word order), even when they mean the same thing. Pascal uses := for assignment, C uses =. Pascal uses { } around comments, C uses /* */, and C++ and Java also allow // to start a comment that runs to the end of the line. Pascal uses begin..end around compound statements, C uses { }. And so forth.
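The short Pascal program below marks where C would spell things differently. The variable names and values are made up for illustration, and the C equivalents appear only in comments, so treat them as a rough guide and not as a complete C program.

```pascal
program NotationDemo;
var
  price, tax, total: integer;
begin
  { This is a Pascal comment. C would write it as /* ... */ }
  price := 200;            { C: price = 200; }
  tax := 15;               { C: tax = 15; }
  total := price + tax;    { same meaning in C, spelled total = price + tax; }
  if total > 100 then
  begin                    { C opens a compound statement with a curly brace }
    writeln('Total: ', total);
    writeln('Over 100')
  end                      { ...and closes it with the matching curly brace }
end.
```

The program prints "Total: 215" and then "Over 100". The meaning would be identical in C; only the spelling changes.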
Most "real" programs are thousands of lines long, so they have to be organized into some kind of modules. This helps to make them
Programming languages provide various techniques for modularization, including subroutines (procedures and functions), abstract data types, separate compilation (Pascal units), and object-oriented programming. We'll talk about these briefly during the second half of the semester.
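As a first taste of the simplest technique, here is a sketch that splits a small job into one function and one procedure. The names TaxOn and PrintBill and the 10 percent rate are made up for illustration.

```pascal
program ModuleDemo;

{ function: computes a value and hands it back to the caller }
function TaxOn(amount: integer): integer;
begin
  TaxOn := amount div 10     { hypothetical 10 percent rate }
end;

{ procedure: carries out an action, here printing a three-line bill }
procedure PrintBill(amount: integer);
begin
  writeln('Amount: ', amount);
  writeln('Tax:    ', TaxOn(amount));
  writeln('Total:  ', amount + TaxOn(amount))
end;

begin
  PrintBill(200);    { amount 200, tax 20, total 220 }
  PrintBill(550)     { amount 550, tax 55, total 605 }
end.
```

The main program is only two lines long. If the tax rule changes, you fix it in one place (TaxOn) and every bill picks up the change.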