C, not C++
• The code for things like
– Linux
– Apache
– GNU software (compilers etc)
– Shells (bash, csh, ksh, …)
are in C not C++.
If you ever get to work on things like the code for Linux, you are more likely to use C than C++.
• Why?
– Mainly historical
– Partly
• Preference by the majority of programmers who work on such code
• Issues with C++ “code bloat” etc
• If you need to read or work with the code of these applications you need to be familiar with C’s idioms, and its differences from C++.
Linus is one of those who doesn’t feel enthusiastic about C++
On Wed, 5 Sep 2007, Dmitry Kakurin wrote:
> When I first looked at Git source code two things struck me as odd:
> 1. Pure C as opposed to C++. No idea why. Please don't talk about portability, it's BS.
*YOU* are full of bullshit.
C++ is a horrible language. It's made more horrible by the fact that a lot of substandard programmers use it, to the point where it's much much easier to generate total and utter crap with it. Quite frankly, even if the choice of C were to do *nothing* but keep the C++ programmers out, that in itself would be a huge reason to use C.
In other words: the choice of C is the only sane choice. I know Miles Bader jokingly said "to piss you off", but it's actually true. I've come to the conclusion that any programmer that would prefer the project to be in C++ over C is likely a programmer that I really *would* prefer to piss off, so that he doesn't come and screw up any project I'm involved with.
C++ leads to really really bad design choices. You invariably start using the "nice" library features of the language like STL and Boost and other total and utter crap, that may "help" you program, but causes:
‐ infinite amounts of pain when they don't work (and anybody who tells me that STL and especially Boost are stable and portable is just so full of BS that it's not even funny)
‐ inefficient abstracted programming models where two years down the road you notice that some abstraction wasn't very efficient, but now all your code depends on all the nice object models around it, and you cannot fix it without rewriting your app.
In other words, the only way to do good, efficient, and system‐level and portable C++ ends up to limit yourself to all the things that are basically available in C. And limiting your project to C means that people don't screw that up, and also means that you get a lot of programmers that do actually understand low‐level issues and don't screw things up with any idiotic "object model" crap.
So I'm sorry, but for something like git, where efficiency was a primary objective, the "advantages" of C++ is just a huge mistake. The fact that we also piss off people who cannot see that is just a big additional advantage.
If you want a VCS that is written in C++, go play with Monotone. Really. They use a "real database". They use "nice object‐oriented libraries". They use "nice C++ abstractions". And quite frankly, as a result of all these design decisions that sound so appealing to some CS people, the end result is a horrible and unmaintainable mess.
But I'm sure you'd like it more than git.
Linus
A side note!
C
• This small segment of CSCI131 will illustrate
– Some features of C
• Particularly those where C seems confusingly different from the C++ that you are studying in CSCI114 and CSCI124
– Some of the C libraries
• E.g. stdio instead of C++’s iostream
A side note!
NetBeans IDE
• All illustrative examples will have been created using the NetBeans IDE
– This IDE allows you to create
• Java programs
• PHP programs
• Javascript programs
• C++ programs
• C programs
– NetBeans itself is written in Java and runs on Linux and Windows
• There are instructions at netbeans.org that tell you how to set it up on Windows using cygwin or mingw (or Microsoft’s C/C++ tools) for C/C++ projects
Use an Integrated Development Environment: IDEs enhance your productivity as programmers!
nabg
A side note!
Netbeans tutorials on the web
• There are tutorials at netbeans.org and on youtube.
It will take you a couple of hours to learn how to use the IDE.
You will become much more productive in your subsequent development work, because the IDE can help
– It checks the syntax of your code as you type it in and highlights errors (e.g. missed parentheses, missing quote, comma or semi‐colon)
– It has built in details of the standard libraries – and can prompt you when you are writing code calling a standard function (showing argument list etc)
– It understands “makefile” syntax a lot better than you and looks after these build files
– It speeds the edit‐compile‐edit‐compile‐edit‐run‐edit‐compile‐run cycle
A side note!
NetBeans – use in other subjects
• NetBeans is used in
– CSCI110 – PHP projects
– CSCI222 – C++ projects and C++ unit testing
– CSCI399 – Java web and PHP/Zend projects
– CSCI398 – Enterprise Java projects
• NetBeans used to be used in CSCI213 for Java but the current lecturer uses Eclipse on his own computer, and so requires the use of that IDE for this subject (so you have to learn both)
– Both Eclipse and NetBeans are free – download from Internet
• Some students prefer alternative open‐source or proprietary development environments – but these won’t be deployed in the lab.
• The Microsoft Visual Studio products are excellent development environments for those developing Windows specific code – but these products aren’t currently used in any CS subjects.
– Microsoft offers free “developer editions” of its Visual Studio products – these are fine for student use, they omit some ‘enterprise level’ features like load testing
NetBeans on Windows
• Many of the screenshots will show
– NetBeans 7.3 on Windows XP
– Cygwin as the Windows hosted Unix subset
• This configuration was used because it was convenient for capturing screenshots and pasting into a Windows PowerPoint file.
• Labs will use Linux with a more recent version of NetBeans
• If you want to keep your computer as a Windows machine (presumably for games etc) then
1. Best option is make it dual boot with a version of Linux and install the gnu development tools on your Linux along with Java and NetBeans from Oracle
2. Not so good – install Linux as a virtual machine within your main Windows system and install same development software in this virtual machine
3. Less good choice – only really appropriate when needing to get screen shots for PowerPoint
  1. Install Java and NetBeans from Oracle on Windows
  2. Install either cygwin or mingw on your Windows machine
  3. Configure your environment variables (full instructions at netbeans.org) so that NetBeans can find the cygwin/mingw setup
C
Not quite the C++ that you know (and love?)
C++ is not a superset of C.
C
• For a start, you don’t have –
– Classes
– Templates
– Exceptions
– Nor
• Type secure I/O with the iostream library
– Compiler checking code using iostream library can determine data types and avoid errors that can occur when a careless programmer uses the stdio library
• New and delete operators
• Pass by reference
• bool as a standard type (C99 later added _Bool and <stdbool.h>)
• STL data types like string or vector
• …
C
• You do have –
– Lots more explicit manipulation of pointers
• When in C++ you pass arguments by reference, you are actually passing pointers –
and the code of the function with reference arguments is actually dereferencing these pointers; C++’s “pass by reference” is just a little “syntactic sugar” to make pointer use more palatable
• In C, you are explicit in your use of pointers
– Beginners often get confused with the syntax of expressions involving dereferencing of pointers
– Lots more use of the “address of” operator &
• When invoking functions requiring “call by reference”, you have to make clear that you are passing the address of a variable and not its value
– More #define macros and lots of typedefs
• In C, typically define all your constants in macros
– Less flexibility as to where you can define local variables
– ….
Of course, 100‐level students haven’t met templates and exceptions and have had only the most rudimentary introduction to classes –
so you won’t miss these (and they are the really good features of C++).
C –
2 trivial differences from the C++ you know
• Comments – You were limited to
/* helpful explanatory comment */
(traditionally, there were no “line comments” using //)
But // comments have been retrofitted: they have been part of the standard since C99, and gcc accepts them.
• Local variable declarations –
– In traditional C (C89/C90), all local variables of a block (function or nested block) must be declared at the start of that block, and in particular for(‐‐‐;‐‐‐;‐‐‐) cannot be used to declare a local scope variable as you are used to in C++ (e.g. for(int i=0;i<10;i++) is illegal in C89; C99 and later allow it)
Demos 1
Mostly stdio & stdlib
Demo 1
Demo 1 project
• Create a C project …
– Choose C/C++ Application
– Pick C (not C++, and definitely not FORTRAN)
• NetBeans generated project
– Has a main.c generated, along with a makefile (and other project related housekeeping files)
Header files
• C and C++ use header files in the same way
– Sometimes there are slightly different versions of the same header file for C and for C++; more often, there are conditional compilation directives in the header files that separate any specifically C related material from any specifically C++ material
– C’s header files all have names like stdio.h
• C++ specific headers vary – you may get .H, .hpp or simply no extension on the file name
– NetBeans assumes that project will use stdio and stdlib – so inserts the #include directives on these header files
• Role of a header file:
– Declare signatures of functions that are in a library file that can be linked with your program’s code
• Also, declare types, #define constants, typedefs, and occasionally global variables
– Needed so that the C/C++ compiler can check that any calls that your code makes to these library functions include the necessary arguments
You can view the header files …
• Mouse over the #include header file, and right click
• This is sometimes an easy way of checking what functions are declared in a particular header file, but …
If you ever need to find an actual header file for some reason, try looking in /usr/include or /usr/local/include; there may be subdirectories with C++ specific headers.
stdio.h has declarations for the
functions:
fopen, fclose, fflush, tmpnam, …
fgetc, fputc, fgets, …
Header files are often hard to read!
• Most modern header files are complex
– Header files tend to depend on other header files so there may be a series of #include statements bringing in other function and MACRO definitions
– Macros are commonly used
– There will be numerous conditionally compiled sections expressing things like
If compiling ANSI C then declare the functions like this … else if compiling C++ then declare the functions like this …
If compiling for Cygwin use this specific modified version
_EXFUN( name, parameter list) is a macro that changes how a function is actually declared; it will arrange the declarations to suit the environment –
e.g. old C function definitions were like this (argument types listed after the () parens!)
char* fgets(str, num, file)
char* str;
int num;
FILE* file;
Modern C function declaration would be
char* fgets(char* str, int num, FILE* file);
Function documentation
man
• C (and most C++) documentation still uses the old “man” page listings
• man (short for manual) is the Unix/Gnu/Linux utility that displays manual pages
– Illustration of use in cygwin console:
man fgets
– It pages through the documentation
SYNOPSIS
#include <stdio.h>
char *fgets(char *BUF, int N, FILE *FP);
#include <stdio.h>
char *_fgets_r(struct _reent *PTR, char *BUF, int N, FILE *FP);
DESCRIPTION
Reads at most N‐1 characters from FP until a newline is found. The characters including to the newline are stored in BUF. The buffer is terminated with a 0.
A man page in a cygwin terminal on Windows
man pages
• Text with markup for the nroff formatter (one of the programs in Unix version 1)!
– Why not convert to HTML and make them easier to read, and a lot easier to follow the cross references that exist in most man pages?
Tradition!
man is the system that all past Unix/Linux programmers have used –
the next generation of programmers must follow this tradition.
Markup tags
.SH like HTML’s <h2>
.PP like HTML’s <p>
Again, if you ever want to find the files, try /usr/man
fgets etc
• Anyway, I now know that I can use an fgets function in my Demo1 program and it will be found when the stdio library file is linked to my code.
• What’s this?
– It is a #define constant
• Included in stdlib (where also find functions like atoi)
• Why does program return an int value?
– For shell scripting!
• You can write a script that will run a sequence of programs; script can have loops, and can have conditionals. A shell script may want to test results returned by programs – results that indicate if all is well (and script to continue) or something went wrong
Demo1
Demo1 ‐ code
• Begin by reading and writing strings
– gets and puts
SYNOPSIS
#include <stdio.h>
char *gets(char *BUF);
DESCRIPTION
Reads characters from standard input until a newline is found. The characters up to the newline are stored in BUF. The newline is discarded, and the buffer is terminated with a 0.
This is a _dangerous_ function, as it has no way of checking the amount of space available in BUF. One of the attacks used by the Internet Worm of 1988 used this to overrun a buffer allocated on the stack of the finger daemon and overwrite the return address, causing the daemon to execute code downloaded into it over the connection.
SYNOPSIS
#include <stdio.h>
int puts(const char *S);
DESCRIPTION
`puts' writes the string at S (followed by a newline, instead of the trailing null) to the standard output stream.
• Enter code in NetBeans editor
• In Projects pane, select project and right‐click select build; NetBeans will display steps in makefile
– gcc runs first as compiler, then as link loader
• Compiling main.c to main.o
• Linking with standard C libraries to produce executable file demo1
Demo1
• NetBeans conventions
1. Builds a “debug” version of the code
• This has symbol table information (names of functions etc) kept to simplify debugging
2. Creates the .o files (compiled, linkable files) in a build subdirectory
• Has Debug/Release subdirectory
– Another “platform” subdirectory
» .o files go here
3. Creates a dist (distribution) directory (again with version and platform subdirectories)
• Linked .exe file goes here
Demo 1 – it runs
• Select project, right‐click and select Run
Demo1 – lots of Unix conventions!
• In Unix
– The difference between disk files and character devices, like the keyboard and teleprinter, is hidden from the application layer
• Unix provides FILE structs
– Maintained by the OS, accessed in applications through FILE* pointers
– The OS transfers data through buffers associated with the structs either to character devices or to disk files
• Each program gets an array of these structs (well, it’s really a bit more complex than that)
– Defaults
» [0] – gets mapped to your console keyboard character device
» [1] – gets mapped to your console “teleprinter” character device, as is [2]
– But you can use the shell to change these mappings to use disk files when you give the command to run the program
stdin, stdout, stderr
• FILE* variables stdin, stdout, and stderr are defined (either in stdio.h or one of the header files that it includes)
– These predefined variables reference entries [0], [1], and [2] in the OS’s collection of FILE structs for the program.
• What is a FILE struct like?
You can get some idea by looking at header files
Only Linux kernel writers would really need to know what is in a FILE struct
C++ iostream classes are just wrappers, holding some extra data along with pointers to structs supplied by the OS; cin, cout, and cerr are iostream objects wrapping stdin, stdout, and stderr
Demo 1 again
• The code would be a little clearer if it made explicit use of stdin, stdout –
– Using functions fgets and fputs rather than gets and puts
• fgets – safer, you specify a limit on the number of characters so you have less chance of overrunning your character array
• fputs – doesn’t automatically add a newline (as done by puts())
Shell redirection
• You should have learnt this in CSCI114.
• The program can be made to read from a file and write to a file –
you probably had an example something like:
$ ./a.out < input1 >outputfile
• Generally, in NetBeans such command line arguments can be specified in a “properties” dialog for the project – but redirection doesn’t work properly with a NetBeans/Windows/cygwin configuration
• So the files were moved to the banshee server
atoi and itoa
Demo1b
• Function atoi is still around – it’s in stdlib.
• Function itoa is not in the standard C libraries (some older compilers supplied one)
– An implementation of itoa was one of the examples in Kernighan and Ritchie’s book on the C language (1978)
atoi and itoa
K&R’s code has been modernised for this illustration
Demo1b
• Executable size (~56kbyte) is a little larger than the assembly language version (that would have been less than 512 bytes)
– There are overheads associated with high level languages
printf & scanf
Demo1c
Formatted I/O using scanf & printf
• Functions like fgets and fputs are OK for simple string inputs and prompts,
but they aren’t really what you’d want in a program that read lots of data and needed to produce well formatted tabular output.
• The stdio library provides some more sophisticated functions –
– scanf, printf, and variants like fprintf and sprintf
Some C++ programmers prefer to use these functions from the stdio library rather than the methods of the iostream library. The stdio library is noticeably smaller, and it tends to work a bit better with socket (networking) I/O streams. But it is less safe!
The way arguments are provided to these functions – a format string and argument variables – means that a C compiler is not required to spot incompatibilities (e.g. integer format specified for a float value), although gcc will warn about many of them when run with -Wall; errors can arise at run‐time. C++’s iostream allows for more complete compile time checks.
From man page (man 3c printf)
printf
NAME
printf, fprintf, sprintf ‐ print formatted output
SYNOPSIS
#include <stdio.h>
int printf(const char *restrict format, /* args*/ ...);
int fprintf(FILE *restrict stream, const char *restrict format, /* args*/ ...);
int sprintf(char *restrict s, const char *restrict format, /* args*/ ...);
DESCRIPTION
The printf() function places output on the standard output stream stdout.
The fprintf() function places output on the named output stream stream.
The sprintf() function places output, followed by the null byte (\0), in consecutive bytes starting at s; it is the user's responsibility to ensure that enough storage is available.
From man page (man 3c printf)
Printf (& friends)
• Each of these functions converts, formats, and prints its arguments under control of the format.
• The format is a character string.
• The format is composed of zero or more directives:
– ordinary characters, which are simply copied to the output stream
– and conversion specifications, each of which results in the fetching of zero or more arguments.
• In format strings containing the %n$ form of conversion specifications, numbered arguments in the argument list can be referenced from the format string as many times as required.
• In format strings containing the % form of conversion specifications, each argument in the argument list is used exactly once.
Printf (& friends)
Conversion Specifications
Each conversion specification is introduced by the % character or by the character sequence %n$, after which the following appear in sequence:
– An optional field, consisting of a decimal digit string followed by a $, specifying the next argument to be converted. If this field is not provided, the args following the last argument converted will be used.
– Zero or more flags (in any order), which modify the meaning of the conversion specification.
– An optional minimum field width. If the converted value has fewer bytes than the field width, it will be padded with spaces by default on the left; it will be padded on the right, if the left‐adjustment flag (‐), described below, is given to the field width. The field width takes the form of an asterisk (*), described below, or a decimal integer.
– If the conversion specifier is s, a standard conforming application interprets the field width as the minimum number of bytes to be printed…
An easier introduction to printf
http://www.cplusplus.com/reference/cstdio/printf/
• int printf ( const char * format, ... );
• Print formatted data to stdout
• Writes the C string pointed by format to the standard output. If format
includes format specifiers (subsequences beginning with %), the additional arguments following format are formatted and inserted in the resulting string replacing their respective specifiers.
Parameters
• format C string that contains the text to be written to stdout. It can optionally contain embedded format specifiers that are replaced by the values specified in subsequent additional arguments and formatted as requested.
A format specifier follows this prototype: %[flags][width][.precision][length]specifier
• And so on for another ~eleven pages
The value returned is the total number of bytes written – it is extremely rare to see
this value used by a program; most C code treats printf as if it were void printf(…)
Examples of printf
Length modifiers –
int format can be used with short, standard and long int – modifier has to say which
similarly, in scanf the f format can be used with float and double – again modifier needed (in printf, %f serves for both, because float arguments are promoted to double)
Printf tutorials
• Lots on web – including
– http://www.codingunit.com/printf-format-specifiers-format-conversions-and-formatted-output
printf
• Not really too hard
– Function call
• Address of format string, and values of arguments pushed onto stack
• Implementation of printf
– Interprets format string, fetching other arguments from stack frame
• Lots of opportunity for error –
– Number of arguments less than number of % fields – well just grab whatever garbage bits there are at top of the stack!
– % field specifier says long integer (8 byte), argument was 4 byte integer –
well just grab another 4 bytes from stack (so value of this field is wrong, as are all subsequent fields)
• Some of those format strings with width and precision can get a little confusing.
From man page (man 3c scanf)
scanf and friends
NAME
scanf, fscanf, sscanf‐ convert formatted input
SYNOPSIS
#include <stdio.h>
int scanf(const char *restrict format...);
int fscanf(FILE *restrict stream, const char *restrict format...);
int sscanf(const char *restrict s, const char *restrict format...);
DESCRIPTION
The scanf() function reads from the standard input stream stdin.
The fscanf() function reads from the named input stream.
The sscanf() function reads from the string s.
scanf
int scanf ( const char * format, ... );
Read formatted data from stdin
Reads data from stdin and stores them according to the parameter format into the locations pointed by the additional arguments.
The additional arguments should point to already allocated objects of the type specified by their corresponding format specifier within the format string.
Parameters
format C string that contains a sequence of characters that control how characters extracted from the stream are treated:
Whitespace character: the function will read and ignore any whitespace characters encountered before the next non‐whitespace character (whitespace characters include spaces, newline and tab characters). A single whitespace in the format string validates any quantity of whitespace characters extracted from the stream (including none).
Non‐whitespace character, except format specifier (%): Any character that is not either a whitespace character (blank, newline or tab) or part of a format specifier (which begin with a % character) causes the function to read the next character from the stream, compare it to this non‐whitespace character and if it matches, it is discarded and the function continues with the next character of format. If the character does not match, the function fails, returning and leaving subsequent characters of the stream unread.
Format specifiers: A sequence formed by an initial percentage sign (%) indicates a format specifier, which is used to specify the type and format of the data to be retrieved from the stream and stored into the locations pointed by the additional arguments.
A format specifier for scanf follows this prototype:
%[*][width][length]specifier
Where the specifier character at the end is the most significant component, since it defines which characters are extracted, their interpretation and the type of its corresponding argument:
scanf
• scanf requires
– A format string
– Addresses for where the input data values are to be stored
• C convention (and C++) –
– arrays are always passed by reference (i.e. the address of element [0] is used if an array name appears in an argument list)
– By default, other data elements are passed by value (a copy is made on the stack)
• So –
– A char[], string, argument will just appear as its name
– All other arguments for scanf must be addresses – so use & (address of) operator with argument names in call
scanf
Return Value
On success, the function returns the number of items of the argument list successfully filled. This count can match the expected number of items or be less (even zero) due to a matching failure, a reading error, or the reach of the end‐of‐file.
If a reading error happens or the end‐of‐file is reached while reading, the proper indicator is set (feof or ferror). And, if either happens before any data could be successfully read, EOF is returned.
Relatively few C programs actually check the return value from scanf – they simply assume that everything is fine and that the expected number of items were read.
Scanf ‐ examples
&yob, &height, &weight ‐ Using “address‐of” so as to pass addresses of variables to scanf; but “name” is an array so compiler knows to pass its address
fopen
There is more to stdio
fopen
FILE * fopen ( const char * filename, const char * mode );
Open file
Opens the file whose name is specified in the parameter filename and associates it with a stream that can be identified in future operations by the FILE pointer returned.
The operations that are allowed on the stream and how these are performed are defined by the mode parameter.
The returned stream is fully buffered by default if it is known to not refer to an interactive device.
Demo1e –
fopen – write, append, read
fopen(filename, “w”);
Write mode – if file does exist then empty it; if file didn’t exist then create it.
fopen(filename, “a”);
Append mode – if file does exist then open it and adjust access pointer so it’s at end of existing content; if file didn’t exist then create it.
fopen(filename, “r”);
Open for reading
Close file after each stage – C doesn’t have automatic destructors so the fact that outFile goes out of scope doesn’t have any consequence.
fopen
There is, of course, an fclose(FILE*) function that allows you to close a file that is no longer needed.
Files are closed automatically on program termination. But generally best to close files as soon as you are finished using them.
stdio
other operations on FILE*
• fseek,ftell, fread, fwrite
– These functions from stdio are useful when working with a binary file that contains data for structs
• There will be an example later – in the section on C structs.
• fflush
– Ensures that any output buffered for this file is written
• Useful in a couple of situations:
– Working with a network connection
» By default, OS may wait for a certain minimum amount of data before sending a message on the network; if your message is smaller, it just waits – so fflush it to make it go
– Tracing a program with lots of fputs(“Got to function …”, logfile) – if you don’t fflush after fputs and the program dies, your log file won’t really show how far you got
stdio
others
• int feof(FILE* stream)
– Returns a non‐zero value if “end‐of‐file” is set on the FILE struct (which will have happened if a read operation tried to read beyond end of file)
• int ferror(FILE* stream)
– Returns a non‐zero value if the FILE struct has a flag set indicating that an I/O error has occurred (e.g. a failed read from the underlying device; a conversion failure in fscanf shows up in its return value, not here)
• int rename(const char *old, const char *new)
– Change name of file
• int remove(const char *filename)
– Deletes the file
• tmpnam() and tmpfile()
– Creating temporary files
stdlib
Demos 1 was supposed to cover stdlib as well as stdio
And also in stdlib …
• It’s a bizarre mixture of assorted functions
– atoi (also atof, atol)
– rand() and srand()
• Pseudo‐random number generator and function that sets start value
– abs()
• Absolute value
– exit(int status)
• Terminate program – returning status value to shell
– bsearch() and qsort()
• C style generic functions that work with arrays of data structures, you supply the function to compare your structs; for bsearch(), your array must already have been sorted – it’s doing a binary search using a key value that is to be compared with key field in structs
–…
The C libraries just grew naturally as people standardized useful functions and added them to libraries.
But that has the unfortunate consequence of a lack of namespace control – function names are
arbitrary, there is no way of identifying the library to which a function belongs.
Java and C# have cleaned this up.
stdlib
• And, in addition to those, stdlib has
– malloc
– free
• These are C’s functions used to claim and release space on the heap (free store); they are wrappers for the lower level system calls (brk())
– C++’s new and delete operators are smart wrappers for malloc and free
• Very important – examples later with C structs.
Demos 2
C’s struct
struct
• structs in C are not the same as structs in C++
– Of course, in C they cannot have any member functions – they are simply collections of data elements
• There is another difference – can illustrate with a simple struct point:
struct point {
int x;
int y;
};
C++ and C projects
struct
C++
C++ version compiles and runs; the C++ compiler is happy with our variable declarations for “origin” and “firstquad”
struct
C
C compiler rejects this code!
C’s struct
• In C, the definition of point created a data type whose name is “struct point”
C compiler is happy to compile this version, and it runs ok.
C’s struct
• Everywhere you want to use a variable of a struct type in C, you will need the struct keyword
void planroute(struct point startpt , struct point endpoint)
{
struct point waypoints[1000];
…
• This gets a little tedious
• So, C programmers will usually define a typedef –
typedef struct point Point;
C’s struct and typedef
• C compiler is happy with that version …
typedef
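• All that the typedef does is introduce another name for an existing type
– Wherever Point appears, the compiler treats it exactly as if struct point had been written (the effect is much like a text substitution, although it is done by the compiler proper and not by the preprocessor).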
• It’s not really defining a type
typedef unsigned int diskpos;
typedef unsigned int clockcount;
…
clockcount timer;
diskpos wantedblock;
…
val = timer + wantedblock;
• The compiler will be quite happy to let you add clockcount
values to diskpos values – because they are the same type unsigned int!
Demo2b
Structs and binary files with stdio
Structs and more
• This example involves a couple of contrived programs that are intended to demonstrate
– struct
– More scanf & printf
– Some functions from <string.h>
– Some limited use of malloc and free
– Argument passing
– Use of binary file
– …
• The example involves a file of “employee records”
Projects
Two multi‐file projects; more elaborate makefile and build process – but the IDE will look after that!
NetBeans oddity
Those “Header Files” and “Source Files” folders –
they aren’t real file directories – simply a fiction
for maintaining a clean view of project
File structure is:
Sharing files between projects in NetBeans:
The projects require the same “Employee” record structure.
This was defined – in files employee.h & employee.c – in the
first project Demos2b1.
When project Demos2b2 was created, these two files were
“copied & pasted” into the new project. But they weren’t copies – they are simply references to the existing files. (It
can make things quite confusing if you really are trying to copy a
file and then change it ‐ you are actually changing the original.
It also requires a change in the #include statements.)
NetBeans authors felt that file sharing amongst projects was
common enough to justify doing it this way.
Demos2b1 & Demos2b2
• Demos2b1
– Read a text file with details of employees and store the data as binary records in a file.
• Demos2b2
– Reads the binary record file, and constructs an index employee‐id => disk_location
– Then a simple interactive loop where user can ask to see a particular employee’s record and possibly pay them more money (disk file gets updated)
• Not a full scale payroll application – just a little code to provide for more examples of C constructs.
C features
• Input with fscanf
– Some oddities
• Reading in a complete line that represents a string with space characters
– Normally, when reading string variables in C/C++ the reader stops at a white space character
• Noting errors on inputs
• fseek, ftell operations
• fread and fwrite for binary file
• malloc and free
Section saying “This is a C style header” – just in case
the file gets used in a C++ project (the C++ compiler has to
follow C conventions when dealing with this code).
The code uses FILE* ‐ so that’s got to be defined, so must
include stdio.h; but the compiler doesn’t have to read the
stdio.h file again if it has already included it.
Need some integer constants; but cannot have
const int NAMELENGTH = 32;
in a header file (because in C that defines a global ‘variable’ with external linkage, and if the header file is included by more than one source file you’ll end up with a multiply defined global at link time)
So use old fashioned #define
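The header itself appears on the slides only as a screenshot, so the following is a sketch of how such a header is typically laid out, not the actual employee.h. The guard name, ROLELENGTH, the field names and the function printEmployee are all assumptions; only NAMELENGTH = 32 and the firstname field come from the slides.

```c
/* employee.h (sketch) */
#ifndef EMPLOYEE_H
#define EMPLOYEE_H

#include <stdio.h>      /* for FILE; stdio.h carries its own include guard */

#ifdef __cplusplus
extern "C" {            /* a C++ compiler must use C conventions for these names */
#endif

#define NAMELENGTH 32   /* a macro can be used as an array size in a C struct */
#define ROLELENGTH 64   /* assumed size, not taken from the slides */

typedef struct employee {
    long id;                      /* assumed field */
    long salary;                  /* assumed field */
    char firstname[NAMELENGTH];
    char role[ROLELENGTH];        /* assumed field */
} Employee;

/* Declaration only; the definition would live in employee.c. */
void printEmployee(FILE *out, const Employee *e);

#ifdef __cplusplus
}
#endif

#endif /* EMPLOYEE_H */
```

The #ifndef guard stops the declarations being processed twice within one source file; the #define avoids the multiply defined global across several source files.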
Using functions from <string.h> (strcmp – string comparison function)
Part of data file …
Using functions (and typedef and struct data types) from time.h
time_t is a typedef variant of long (at least on this machine – the definition seems to be in machine/types.h which gets #included from sys/_types.h which gets #included from sys/types.h which gets #included from time.h).
Are you still surprised that compilations can take a long time? Think of all the files that must be opened and read!
struct tm represents a timestamp with fields like day, month, and year (actually year offset from 1900)
Function time() – reads system clock and fills in variable whose address is passed
Function localtime() – fills in a static data structure with the timestamp data that match time value passed as reference argument, and returns a pointer to this static area.
In fprintf statements ‐ %ld when argument is a long, %d when argument is a standard int
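A minimal sketch of these time.h calls used together. The function names describe_time and describe_now and the output format are made up for the illustration; time(), localtime(), struct tm and its tm_year, tm_mon and tm_mday fields are standard C.

```c
#include <stdio.h>
#include <time.h>

/* Formats t as "YYYY-MM-DD (seconds)" into buf.
   Returns the snprintf result, or -1 if localtime() fails. */
int describe_time(time_t t, char *buf, size_t buflen)
{
    struct tm *parts = localtime(&t);   /* points at a static area; do not free */
    if (parts == NULL)
        return -1;
    return snprintf(buf, buflen, "%d-%02d-%02d (%ld)",
                    parts->tm_year + 1900,  /* years are counted from 1900 */
                    parts->tm_mon + 1,      /* months are counted from 0 */
                    parts->tm_mday,
                    (long)t);               /* cast so that %ld is always right */
}

/* Reads the system clock, then formats it. */
int describe_now(char *buf, size_t buflen)
{
    time_t now;
    time(&now);                         /* fills in the variable whose address is passed */
    return describe_time(now, buf, buflen);
}
```

Because localtime() reuses one static struct tm, a second call overwrites the result of the first; copy the fields out if both are needed.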
Demo2b1
In NetBeans, text data files can be added to a project by selecting New File/Other/empty file
and giving the file a name like empdata.txt
Using stdio (fscanf etc), stdlib (exit, also NULL via further #includes), and string – for string copy functions.
Those are all standard C library, so are #include <…>.
Also need employee.h from this project
Opening (& later closing) files
• Open a file for binary write
• Loop
– Zero out an Employee record
– Read in data from file, copy strings into record being careful in case they are too long
– Write the record as a binary record.
Binary file ‘b’ – if you don’t specify binary then some operating systems will change the data as they are written – e.g. a byte that matches the ‘\n’ character may get changed to ‘\r’ or be replaced by the two bytes “\r\n”
Then a “forever” loop where read, check, and write records. The forever loop is exited when reach end of the text input file.
Finally, close files and return
Read text file
Zero out struct; C doesn’t have constructors like C++, so you need to clean out any struct you use
validate
1 – read 2 long integers, then consume anything else (trailing whitespace) on current line
2 – read a complete line into buffer – something like “Chief Executive Officer”
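strncpy is a length‐limited copy; there’s only space for NAMELENGTH‐1 characters (and a nul character) in the firstname field – so don’t try to put any more! strncpy does not add the nul itself when the source is too long; the nul comes from having zeroed the struct first.
Note use of & ‐ address of operator – need to pass addresses of fields in person struct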
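The read, check and write loop can be sketched as below. This is not the code from the screenshots: the record layout (a line with two long integers, then a name line, then a role line), the field names and the function names are assumptions, and fgets is swapped in for reading whole lines where the slides use fscanf.

```c
#include <stdio.h>
#include <string.h>

#define NAMELENGTH 32
#define ROLELENGTH 64    /* assumed size */
#define LINELENGTH 256   /* assumed maximum input line length */

typedef struct employee {
    long id;
    long salary;
    char firstname[NAMELENGTH];
    char role[ROLELENGTH];
} Employee;

/* Reads one complete line (spaces included) and drops the newline.
   Returns 1 on success, 0 at end of file. */
static int read_line(FILE *in, char *buf, size_t buflen)
{
    if (fgets(buf, (int)buflen, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* Returns 1 if a record was read, 0 at end of input, -1 on bad input. */
int read_employee(FILE *in, Employee *e)
{
    char line[LINELENGTH];
    int got;

    memset(e, 0, sizeof *e);                  /* no constructors in C */

    got = fscanf(in, "%ld %ld", &e->id, &e->salary);  /* & passes field addresses */
    if (got == EOF)
        return 0;
    if (got != 2)
        return -1;
    if (!read_line(in, line, sizeof line))    /* consume the rest of that line */
        return -1;

    if (!read_line(in, line, sizeof line))
        return -1;
    strncpy(e->firstname, line, NAMELENGTH - 1);  /* last byte stays nul */

    if (!read_line(in, line, sizeof line))
        return -1;
    strncpy(e->role, line, ROLELENGTH - 1);
    return 1;
}

/* Copies every text record to the binary file.
   Returns the number of records written, or -1 on error. */
long convert(FILE *in, FILE *out)
{
    Employee e;
    long count = 0;
    int status;

    while ((status = read_employee(in, &e)) == 1) {
        if (fwrite(&e, sizeof e, 1, out) != 1)
            return -1;
        count++;
    }
    return status == 0 ? count : -1;
}
```

The output stream would be opened with fopen(name, "wb") so that the bytes of each record are written unchanged.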
Binary write
Demo2b1 runs
The Unix command od (octal dump) offers
a means of getting some view of the contents
of a binary file;
here it is in cygwin
Demo2b2
• Construct index (employeeid => record position)
• Interactively view and update records
• Index requires another little struct
• When including employee.h, the #include statements must make clear that it’s the one in project Demo2b1
Index
• Index
– An array of pointers to small structs
• Employee identifier
• File position as obtained by ftell()
– Create the structs as needed
– (No check that number of employees doesn’t exceed maximum specified for array – this check left as exercise!)
malloc
• malloc
– Call – you request a block of bytes of specified size
– Return – a void* (pointer to anything) value, the address of the start of the block of bytes
» In C, this can be assigned to any pointer type
ftell
Fill the array with NULL pointers
Open the binary file just for
reading
Read a block of the right number
of bytes into the Employee struct x
Create a new indexrec struct
in heap (equivalent to a C++
new operation, but unchecked
error prone low level style)
Fill in fields of struct
Add to index array
Use ftell() to get file byte address of start of next record.
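The captions above describe code shown only as a screenshot; a sketch along the same lines follows. The struct layouts and the names build_index, find_position and free_index are assumptions. Unlike the original, this version includes the bounds check that the slides leave as an exercise.

```c
#include <stdio.h>
#include <stdlib.h>

#define NAMELENGTH 32
#define MAXEMPLOYEES 100   /* assumed capacity of the index array */

typedef struct employee {
    long id;
    long salary;
    char firstname[NAMELENGTH];
} Employee;

typedef struct indexrec {
    long id;        /* employee identifier */
    long position;  /* byte offset of the record, from ftell() */
} IndexRec;

/* Fills index[] with heap-allocated entries, one per record in fp.
   Returns the number of entries, or -1 if malloc fails. */
int build_index(FILE *fp, IndexRec *index[], int maxentries)
{
    Employee x;
    long position;
    int count = 0;
    int i;

    for (i = 0; i < maxentries; i++)
        index[i] = NULL;

    position = ftell(fp);                        /* start of the first record */
    while (count < maxentries && fread(&x, sizeof x, 1, fp) == 1) {
        IndexRec *rec = malloc(sizeof *rec);     /* void* converts implicitly in C */
        if (rec == NULL)
            return -1;
        rec->id = x.id;
        rec->position = position;
        index[count++] = rec;
        position = ftell(fp);                    /* start of the next record */
    }
    return count;
}

/* Linear search; returns the file position, or -1 if the id is unknown. */
long find_position(IndexRec *index[], int count, long id)
{
    int i;
    for (i = 0; i < count; i++)
        if (index[i]->id == id)
            return index[i]->position;
    return -1;
}

/* Releases every entry created by build_index(). */
void free_index(IndexRec *index[], int count)
{
    int i;
    for (i = 0; i < count; i++) {
        free(index[i]);
        index[i] = NULL;
    }
}
```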
Finding entry in index
interact
• Open binary file for update
• Loop:
– Prompt for and read employeeid
– Find position in file using index
– Allocate space for record
– Read in record
– Display record
– Read update info
– Write record
– Release space used for record
Open binary file for update – rb+
Prompt for employee id
Look up location in index
fseek
Seeks to correct point in file
Allocates a heap structure for the record
Reads in the block of bytes
Display, & get update data
Write the block of bytes back
Free the block of bytes that were allocated in the heap
Could have used a local stack variable for the Employee record – just using this approach to
illustrate free()
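The seek, read, update, write and free sequence might look roughly like this. The Employee layout and the function name give_raise are assumptions; the record is allocated with malloc only to mirror the slides, which note that a local variable would serve equally well.

```c
#include <stdio.h>
#include <stdlib.h>

#define NAMELENGTH 32

typedef struct employee {
    long id;
    long salary;
    char firstname[NAMELENGTH];
} Employee;

/* Adds amount to the salary of the record stored at byte offset position.
   fp must be open for update (for example "rb+").
   Returns the new salary, or -1 on any failure. */
long give_raise(FILE *fp, long position, long amount)
{
    long result = -1;
    Employee *rec = malloc(sizeof *rec);       /* heap block for one record */
    if (rec == NULL)
        return -1;

    if (fseek(fp, position, SEEK_SET) == 0 &&
        fread(rec, sizeof *rec, 1, fp) == 1) {
        rec->salary += amount;
        /* The read moved the file position past the record, so seek back.
           A seek is also required when switching from reading to writing. */
        if (fseek(fp, position, SEEK_SET) == 0 &&
            fwrite(rec, sizeof *rec, 1, fp) == 1 &&
            fflush(fp) == 0)
            result = rec->salary;
    }

    free(rec);                                  /* release the heap block */
    return result;
}
```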
free
Demo2b2 runs
Compilation and linkage
• Bit more complex for multi‐file programs, more convenient if you can leave it to an IDE to sort things out
– 3 gcc compile steps
• For Demos2b1/employee.c
• Demos2b2/indexrec.c
• Demos2b2/main.c
– gcc link step – employee.o, indexrec.o, main.o
Demo2c
This time use really low level I/O
Use the standard Unix system calls
unistd.h & fcntl.h
stdio, FILE and the file system
• stdio library with its FILE structures provides some useful features
– Formatting, string handling, buffering
• But you can simply use the underlying file system
– Access via fcntl.h and unistd.h
• These have the function signatures for the standard Unix operating system calls
unistd.h
• Wikipedia explains
“In the C and C++ programming languages, unistd.h is the name of the header file that provides access to the POSIX operating system API. It is defined by the POSIX.1 standard, the base of the Single Unix Specification, and should therefore be available in any conforming (or quasi‐conforming) operating system/compiler (all official versions of UNIX, including Mac OS X, GNU/Linux, etc.).”
• Three main groups of functions
– File structure related
– Process related
– Interprocess communications
unistd.h
NAME
unistd.h ‐ standard symbolic constants and types
SYNOPSIS
#include <unistd.h>
DESCRIPTION
The <unistd.h> header defines miscellaneous symbolic constants and types, and declares miscellaneous functions.
unistd.h : File
• Unix/Linux systems allocate each process an array of file descriptors (one of the fields in a FILE is the index of the file descriptor it’s using)
– You see a file descriptor just as an index number
• 0 = stdin, 1 = stdout, 2 = stderr;
• Other values – as returned by open
– System knows what they really are
• File calls
– creat(), open(), link(), unlink(), dup(), close()
• creat(), open() – these are in fcntl.h along with constant declarations used to specify ‘read‐only’, ‘write’ etc
• creat() ‐ equivalent to open() with flags equal to O_CREAT|O_WRONLY|O_TRUNC.
– read(), write(),
– lseek()
– stat(), fstat(), ioctl()
– access(), chmod(), chown(), umask()
System calls like these were documented in the 2nd chapter of the original Unix manual – so you get the documentation via man 2 cmdname.
unistd.h : File
• creat(), link(), unlink(), dup(), close(), open()
– Open, close – obvious; open takes filename and int constant for mode
– dup – duplicate an entry in the system’s file descriptor table (gives you two file descriptors referencing same file – sometimes used in effect to renumber an I/O stream, sometimes deliberately used with different permissions)
– link() and unlink() – create and destroy 2nd‐ry entries in file directories
• read(), write()
– Transfer blocks of bytes – less buffering than with FILE* fread/fwrite
• lseek()
– Move to position in file (obviously doesn’t work with character special devices like keyboard – for keyboard the only interpretation would be forget what was written (seeking back) or guess what will be written (seeking forward))
• stat(), fstat()
– Ask for details about file (size etc)
• ioctl()
– Change way device is handled etc – e.g. non‐blocking I/O
• access(), chmod(), chown(), umask()
– Change properties of file in directory – you’ve used these as shell commands
Can work at file descriptor level
• Use open – get an integer file descriptor
• Use read/write – pass file descriptor as arguments
• Use lseek – position file
– No separate tell call; lseek() returns the resulting offset, so lseek(fd, 0, SEEK_CUR) reports the current position, but mostly you are expected to keep track of file position for yourself!
Interact routine using system calls
Extra #includes on the Unix simulation headers
Integer file descriptor, obtained by open() call (mode is read and write)
lseek
read
write
Remember to close your file descriptor!
File descriptor table is finite – at least 16 entries, these days more – but if you don’t
invoke close on the file descriptor when you’ve finished with a file then you may run out of I/O channels.
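A sketch of the same record update done with file descriptors in place of a FILE*. The Employee layout and the name give_raise_fd are assumptions; open(), lseek(), read(), write() and close() are the standard POSIX calls, so this needs a Unix-like system (or Cygwin).

```c
#include <fcntl.h>       /* open(), O_RDWR */
#include <sys/types.h>   /* off_t, ssize_t */
#include <unistd.h>      /* lseek(), read(), write(), close(), SEEK_SET */

#define NAMELENGTH 32

typedef struct employee {
    long id;
    long salary;
    char firstname[NAMELENGTH];
} Employee;

/* Adds amount to the salary of the record at byte offset position in the
   named file. Returns the new salary, or -1 on any failure. */
long give_raise_fd(const char *path, off_t position, long amount)
{
    Employee rec;
    long result = -1;
    int fd = open(path, O_RDWR);               /* integer file descriptor */
    if (fd < 0)
        return -1;

    if (lseek(fd, position, SEEK_SET) == position &&
        read(fd, &rec, sizeof rec) == (ssize_t)sizeof rec) {
        rec.salary += amount;
        /* read() advanced the offset, so reposition before writing. */
        if (lseek(fd, position, SEEK_SET) == position &&
            write(fd, &rec, sizeof rec) == (ssize_t)sizeof rec)
            result = rec.salary;
    }

    close(fd);                                 /* always give the descriptor back */
    return result;
}
```

There is no buffering layer here, so no flush is needed; each write() goes straight to the operating system.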
C++
Demo3
“Call by reference”
• You have been taught C++’s version of call by reference
• Pass by reference, C style
C++ code
• For call by reference, the code generated by the C++ compiler
– for the function call pushes the addresses of the variables ‘a’ and ‘b’ onto the stack
– The code generated for the function knows that the reference arguments double& arg1, and double& arg2 are pointers
• Code for temp = arg1; is really something like
– mov arg1,r0
– mov (r0), temp
• It de‐references the pointers to get at the data value
In C, you make these details explicit!
C style pass by reference
• You
– Define the function with pointer data types as arguments – double* arg1
– Code the function using explicit de‐referencing of the pointers when you want to get at the values of the arguments
– Use the & (address of operator) to get the addresses needed as arguments in the call to the function – swap(&a, &b)
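The three steps fit in a few lines. The swap function and its argument names follow the slides; sort_pair is a hypothetical caller added to show the & at the call site.

```c
/* Arguments are pointers; the values are reached by explicit de-referencing. */
void swap(double *arg1, double *arg2)
{
    double temp = *arg1;   /* fetch the value that arg1 points at */
    *arg1 = *arg2;
    *arg2 = temp;
}

/* Leaves the smaller value in *lo and the larger in *hi. */
void sort_pair(double *lo, double *hi)
{
    if (*lo > *hi)
        swap(lo, hi);      /* lo and hi are already addresses, so no & here */
}
```

A caller with two ordinary variables writes swap(&a, &b); the & makes visible what a C++ reference argument does silently.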
It’s very much like the PDP‐11 code isn’t it!
• The C code here is very close to the assembly language code that you would have written for the PDP‐11!
• Back in ancient times before C became the favoured language for systems programming, proponents of other (now dead) languages used to deride C as “that high level assembly language for the PDP‐11”.
Extra libraries
• A basic C installation will have the libraries for stdio, string, stdlib etc
Linking with libraries
• But if you want things like images (a library of code that lets you generate a picture as a png file), you must add extra libraries
– You will typically have to use Google a bit to identify appropriate libraries
– You install these libraries using applications like the cygwin setup exe, or the synaptic package manager on Ubuntu, or apt‐get install
• A library consists of
– The header file with the declarations of the functions provided and data types used etc
– One or both of
• A statically linked library
• A dynamically linked shared library
(The differences will be explained later)
Libraries
• The gcc utility requires
– At compile time: the include header files
– At link time: details of extra libraries that your code uses.
• Your compilation environment will be automatically configured so that the most commonly used sets of header files, and the corresponding library files, are in known locations
– The compiler and linker will find them automatically
Where do you find the include files and libraries?
• When you install a library package, it will most likely end up in one of two places ‐ /usr or /usr/local
– The header file(s) will go in /usr/include or /usr/local/include
– The library code will go in /usr/lib or /usr/local/lib
• Less standard libraries? – You will have to provide details of where the header files can be found and what libraries must be used
– These details go in the project’s “makefile”
• It’s easier when using an IDE like NetBeans or Eclipse – it knows the makefile format and will help you enter the details
The makefile
• The makefile will need extra lines that direct the build process so that
– The additional header files can be found by the compiler,
– The required libraries are scanned by the linker¶
• This is sometimes problematic!
– The functions in one library may depend on functions in some other library
» You need to identify all the libraries required
» You may need to list them in a specific order
• If your code explicitly uses functions in library‐A but not library‐B (whose functions are called indirectly through functions in library‐A) you will probably have to list library‐A first in the link process
¶The task of the linker will be explained in more detail later!
Example – Using the gd graphics library
• “The GD Graphics Library is a graphics software library by Thomas Boutell and others for dynamically manipulating images. Its native programming language is ANSI C, but it has interfaces for many other programming languages. It can create GIFs, JPEGs, PNGs, and WBMPs. Support for drawing GIFs was dropped in 1999 when Unisys revoked the royalty‐free license granted to non‐commercial software projects for the LZW compression method used by GIFs. When the Unisys patent expired worldwide on July 7, 2004, GIF support was subsequently re‐enabled.
• GD originally stood for "GIF Draw". However, since the revoking of the Unisys license, it has informally stood for "Graphics Draw".
• GD can create images composed of lines, arcs, text (using program‐selected fonts), other images, and multiple colors. Version 2.0 adds support for truecolor images, alpha channels, resampling (for smooth resizing of truecolor images), and many other features.
• GD supports numerous programming languages including C, PHP, Perl, Python, OCaml, Tcl, Lua, Pascal, GNU Octave, REXX, Ruby and Go.”
GD smiley face project
• The GD library is documented at http://www.boutell.com/gd/manual2.0.33.html#index
– The library has one header file – gd.h
– For linking, the documentation states that you must link with
• gd,
• a library with code for png images,
• a library for data compression,
• another image library (for jpeg images),
• a library for handling fonts (character strings in any generated image),
• and the maths library
– These libraries must be identified to the linker
NetBeans
Add headers and libraries
You will meet GD if you take subjects like CSCI110 or CSCI399 – it’s used in PHP
NetBeans
• Example is a little NetBeans project
– NetBeans uses a “Properties” dialog that simplifies the task of editing the makefile to add include header file collections and link libraries
• Right click on the project and select Properties
– There are subsections for the compiler
» Specify things like header file collections
– And for the linker
» List the libraries
• In my install, the gd library header was in /usr/include and the libraries in /usr/lib
– Since the headers were in the standard location, I didn’t need to specify anything
– I had to list the libraries
NetBeans
Add headers and libraries
Didn’t need to add any extra include directories –
would have had to if library installed in /usr/local or
somewhere else
Needed to specify 6 extra libraries for the linker
Project
• The gd headers must be included
– Right‐click on the include line lets you view the header and see the functions that the gd library offers
Code using GD
• Programs using GD have a typical pattern
1. Allocate an image
• An image is a 2d array of 4‐byte integers – alpha, red, green, blue
2. Create some colours
3. Use drawing primitives such as gdImageLine()
4. Open a (binary) output file
5. Use a “codec” (coder‐decoder) to write the image array in a chosen standard image format – such as gif, jpeg, png
Build and run
• The output from the build step shows the make process with the linker scanning all those libraries
gcc -c -g -MMD -MP -MF build/Debug/Cygwin_4.x-Windows/main.o.d -o build/Debug/Cygwin_4.x-Windows/main.o main.c
mkdir -p dist/Debug/Cygwin_4.x-Windows
gcc -o dist/Debug/Cygwin_4.x-Windows/smileyface build/Debug/Cygwin_4.x-Windows/main.o /cygdrive/C/cygwin/lib/libgd.dll.a /cygdrive/C/cygwin/lib/libpng.dll.a /cygdrive/C/cygwin/lib/libjpeg.dll.a /cygdrive/C/cygwin/lib/libz.dll.a /cygdrive/C/cygwin/lib/libfreetype.dll.a /cygdrive/C/cygwin/lib/libm.a
• It runs
What about all those missing C++ features …
Exceptions
Templates
Classes
Exceptions
• I think that these are left to CSCI204
– Idea is you write some code like
try {
// lots of code calling functions –
…
}
catch(sometype varname) {
// deal with anything that went wrong, details
// are in variable named varname of type sometype
…
}
– The point being that the called functions cannot deal with the problem, but the code here can
• E.g.
1. Your called code has to open a named file
2. Deep down the function call chain, the file is found not to exist,
3. The called code cannot do anything about a non‐existent file, so it throws an exception
4. The code here might be able to arrange to ask the user for a different file name and then repeat the action with the correctly named file
C
• C has a mechanism to achieve something like this based on functions setjmp and longjmp
– You use setjmp to create a special data structure (that basically records the current stack pointer), and pass this data structure as a reference argument down through the function call sequence
– If something goes wrong, you use the longjmp function to interpret the data structure and get you back to the point where it was created.
• C’s setjmp/longjmp isn’t as tidy as a C++ exception
– It simply smashes the stack; C++ carefully unwinds the stack, firing any destructors appropriate in the functions that are being exited
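The missing-file scenario can be sketched with setjmp and longjmp as follows. All function names and the ERR_NOFILE code are hypothetical; setjmp(), longjmp() and jmp_buf are the standard <setjmp.h> facilities.

```c
#include <setjmp.h>
#include <stdio.h>

#define ERR_NOFILE 1   /* value delivered to setjmp() by longjmp() */

/* Deep in the call chain: cannot cope with a missing file, so it jumps out. */
static FILE *open_or_jump(const char *name, jmp_buf *recover)
{
    FILE *fp = fopen(name, "r");
    if (fp == NULL)
        longjmp(*recover, ERR_NOFILE);   /* the "throw"; does not return */
    return fp;
}

/* Intermediate function: just passes the jmp_buf further down. */
static long count_bytes(const char *name, jmp_buf *recover)
{
    FILE *fp = open_or_jump(name, recover);
    long n = 0;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}

/* Returns the size of the named file in bytes, or -1 if it cannot be opened. */
long try_count_bytes(const char *name)
{
    jmp_buf recover;

    if (setjmp(recover) != 0) {
        /* Arrived here through longjmp(): the "catch" part. */
        return -1;
    }
    return count_bytes(name, &recover);  /* the "try" part */
}
```

Nothing is cleaned up on the way back: a file opened, or memory allocated, by a function that gets jumped over simply stays open or allocated.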
Templates
• Again, you may not be covering these in the 100‐level C++ subjects
• Templates exist to allow you to write “generic” code – code that can apply to a variety of different data types.
– One of your first examples of templates is likely to be a C++ linked list class that can be used to hold data items
• What is the type of a “data item”?
– Well, it could be anything – string, instance of some struct that you’ve defined (e.g. struct employeerec), instance of another class.
• So in C++, you use a “template type”
C++ template class
• There is a complete example at cplusplus.com
The linked list part of the code doesn’t care. Each listnode contains a pointer (to the next listnode) and a data value. The template structure allows you to leave the type of the data value unspecified.
C++ template class
Using a C++ template
The template List
class, that uses the
template ListNode
class is also given at
Code that simply builds a list where the data values are ints
And in C?
• It’s not possible to achieve exactly the same effect – but you can easily define something like a “list of data items” – where the type of a data item is flexible.
• In C, you use void*
– void* ‐ pointer to any kind of struct created in the heap
• So you can easily have a C “list class” where each list node has a “pointer to list node” link, and a “pointer to any data struct” data
A C “list class”???
• Of course you can create “classes” in C
– The first implementation of C++ used a pre‐compiler called Cfront that converted the C++ code into standard C that could be compiled with existing compilers
• It’s just it’s more work for the programmer – work that the C++ compiler automates
– The C++ compiler also applies checks and restrictions that eliminate some of the errors that can be made by a programmer constructing a “class in C”.
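A minimal sketch of such a C “list class”, assuming nothing beyond the description above: each node holds a link and a void* data pointer. The type and function names (ListNode, List, list_init, list_append, list_get, list_clear) are made up for this illustration.

```c
#include <stdlib.h>

typedef struct listnode {
    struct listnode *next;   /* pointer to list node */
    void *data;              /* pointer to any kind of data */
} ListNode;

typedef struct list {
    ListNode *head;
    ListNode *tail;
    int length;
} List;

/* Plays the part of a constructor. */
void list_init(List *l)
{
    l->head = NULL;
    l->tail = NULL;
    l->length = 0;
}

/* Appends a data pointer. Returns 1 on success, 0 if malloc fails. */
int list_append(List *l, void *data)
{
    ListNode *node = malloc(sizeof *node);
    if (node == NULL)
        return 0;
    node->next = NULL;
    node->data = data;
    if (l->tail == NULL)
        l->head = node;
    else
        l->tail->next = node;
    l->tail = node;
    l->length++;
    return 1;
}

/* Returns the data pointer at index, or NULL if index is out of range.
   The caller must cast it back to the right type; nothing checks this. */
void *list_get(const List *l, int index)
{
    const ListNode *node = l->head;
    if (index < 0)
        return NULL;
    while (node != NULL && index-- > 0)
        node = node->next;
    return node != NULL ? node->data : NULL;
}

/* Plays the part of a destructor: frees the nodes but not the data. */
void list_clear(List *l)
{
    ListNode *node = l->head;
    while (node != NULL) {
        ListNode *next = node->next;
        free(node);
        node = next;
    }
    list_init(l);
}
```

The cast on the way out is where the C version gives up the type checking that a C++ template keeps: storing an int* and reading it back as a double* compiles without complaint.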