Why C is Almost Always the Wrong Choice

C has no true string data type.

The common arguments defending this as a feature rather than a shortcoming go something like this:

  • Performance. The argument here is that statically-allocated, null-terminated char arrays are faster than accessing the heap, and by forcing the programmer to manage his own memory, huge performance gains will result.
  • Portability. This one goes along the lines that introducing a string type could introduce portability problems, as the semantics of such a type could be wildly different from architecture to architecture.
  • Closeness to the machine. C is intended to be as “close to the machine” as possible, providing minimal abstraction: since the machine has no concept of a string, neither should C.

If these arguments are true, then we shouldn’t be using C for more than a tiny fraction of what it is being used for today. The reality of these arguments is more like this:

  • Performance: I’m a programmer of great hubris who actually believes that I can reinvent the manual memory management wheel better than the last million programmers before me (especially those snooty implementers of high-level languages), and I think that demonstrating my use of pointers, malloc(), gdb, and valgrind makes me look cooler than you.
  • Portability: I’m actually daft enough to think that the unintelligible spaghetti of preprocessor macros in this project constitutes some example of elegant, portable code, and that such things make me look cooler than you.
  • Closeness to the machine: I’ve never actually developed anything that runs in ring zero, but using the same language that Linus Torvalds does makes me look cooler than you.

The technical debt this attitude has incurred is tremendous: nearly every application that gets a steady stream of security vulnerability patches is written in C, and the vast majority of those vulnerabilities are buffer overflows made possible by bugs in manual memory management code. How many times has bind or sendmail been patched for these problems?
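To make that class of bug concrete, here is a minimal sketch of the bookkeeping C pushes onto the programmer for something as mundane as copying a string. The helper name copy_bounded and its return convention are my own assumptions for illustration; they don't come from any particular codebase.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): copies src into a
 * fixed-size buffer without ever writing past dst_size bytes.
 * Returns 0 on success, or -1 if the arguments are invalid or src did not
 * fit (in which case dst holds a truncated, still NUL-terminated copy). */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    /* The classic bug is strcpy(dst, src): it keeps writing until it
     * reaches the terminator in src, whatever the size of dst. */
    int needed = snprintf(dst, dst_size, "%s", src);

    /* snprintf reports the length it wanted to write, so a value of
     * dst_size or more means the input was truncated. */
    if (needed < 0 || (size_t)needed >= dst_size)
        return -1;
    return 0;
}
```

Every call site has to pass the right size and check the result by hand. Forget either one and you are back to the overflow; a language with a real string type does this check for you on every access.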

The truth is that most software will work better and run faster with the dynamic memory management provided by high-level language runtimes: the best algorithms for most common cases are well-known and have been implemented better than most programmers could ever do. For most corner cases, writing a shared library in C and linking it into your application (written in a high-level language) is a better choice than going all-in on C. This provides isolation of unsafe code, and results in the majority of your application being easier to read, and easier for open-source developers to contribute to. And most applications won’t even need any C code at all. Let’s face it: the majority of us are not writing kernels, database management systems, compilers, or graphics-intensive code (and in the latter case, C’s strengths are very much debatable).
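As a rough sketch of what the "shared library in C" approach might look like, here is the kind of narrow surface I have in mind: one function, a pointer plus an explicit length, no ownership changing hands. The function name hotpath_fletcher16 is hypothetical, and the Fletcher-16 checksum is just a stand-in for whatever corner case actually needs C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical exported function (name and purpose are assumptions):
 * computes a Fletcher-16 checksum over len bytes starting at data.
 * The caller owns the buffer; nothing is allocated or retained here. */
uint16_t hotpath_fletcher16(const uint8_t *data, size_t len)
{
    uint16_t sum1 = 0;
    uint16_t sum2 = 0;

    /* Treat a missing buffer as empty rather than dereferencing NULL. */
    if (data == NULL)
        return 0;

    for (size_t i = 0; i < len; i++) {
        sum1 = (uint16_t)((sum1 + data[i]) % 255);
        sum2 = (uint16_t)((sum2 + sum1) % 255);
    }
    return (uint16_t)((sum2 << 8) | sum1);
}
```

Built as a shared library, something like this can be called through the foreign-function interface of most high-level languages, and the unsafe part of the application stays a few dozen lines that are easy to audit.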

The long and short of it is that most software today is I/O-bound and not CPU-bound: almost every single one of the old network services (DNS servers, mail servers, IRC servers, HTTP servers, etc.) stands to gain absolutely nothing from being implemented in C, and should be written in a high-level language so that it can benefit from run-time bounds checking, type checking, and leak-free memory management.

Can I put out a CVE on this?

OOF – A New Programming Metaphor

NEW PROGRAMMING METAPHOR

The “OOF” or “Object-Oriented Fail” family of languages is intended to deliver reliable, deterministic, 100% failure rates to public institutions seeking to implement social programs that increase nationwide stress, eliminate all disposable income of the middle class, and decrease the average quality of life for all of its members.

INHERITANCE MODEL

In the past, OO languages have been forced to choose between single- and multiple-inheritance models. OOF languages, however, dispense with this model by introducing the concept of “half inheritance”: derived classes will inherit precisely half of the properties of their base classes, selected by a pseudorandom number generator at runtime. This ensures that no defined class can be represented as a directed acyclic graph or UML model. By choosing the revolutionary half-inheritance model, all “design-up-front” development methodologies will be rendered instantly obsolete.

MEMORY MANAGEMENT

Instead of employing memory management, OOF languages must, in the initialization routine of the garbage injector, use operating system calls to allocate and pre-initialize the largest block of free heap available. Access to objects and variables must be aligned on “n”-byte boundaries, where “n” represents the current day of the month. This will provide consistency in the number of available variables and/or objects in a given program on a given day. Programmers should take care to design their usage of variables and objects with the lowest day-number of any given month in mind. All access to heap memory is through pointers in the code segment, which are to be managed by the garbage injector.

GARBAGE INJECTOR

The garbage injector’s prime responsibility is to modify the addresses of all such aforementioned pointers using the same runtime pseudorandom number generator as used by the half-inheritance system. Data pointed to will be copied from its original location to its new location in ascending order. No mechanism is provided to examine the usage of the new location prior to overwriting its data with the result of the pending copy operation. This mandatory operation will run during idle CPU cycles, and result in a consistently inconsistent execution environment for all system and user programs.

GARBAGE COLLECTOR

The garbage collector will address the inconsistencies of the OOF program’s execution environment, as ensured by the garbage injector. Its operation is simple: any access to variables or objects will trigger a call into system BIOS routines which will reboot the entire computer, thus eliminating any inconsistencies in the program’s state, and indeed, the program’s ability to store and retrieve state at all.

IMPLEMENTATION GUIDELINES

We recommend that the runtime library for any OOF language be web-enabled in such a manner as to crash or otherwise disable the end user’s web user agent when any request is made for an OOF program or subroutine. Ideally, this disabling mechanism should also initiate a low-level format of the user’s fixed storage devices. It is recommended that any government implementation of an OOF system be paired with extensive subsidies for large computer repair establishments, such as The Geek Squad. This will provide maximal economic benefit to all.

What an object-oriented MUMPS could look like, without breaking existing code

A little idiomatic support for the OO paradigm is something I’d love to see MUMPS embrace in the future. This is not to say that you can’t write OO code in MUMPS ’95 (you can even write OO code in assembly language), but the syntactic sugar would be nice, as would the scoping protection and encapsulation it would give. The introduction of dot notation would lend itself very well to creating clean new API libraries. Some of you may not know that dot notation is already used by some MUMPS vendors for calling out to external (non-MUMPS) routines.

We’d add a few new things:

  1. Overloading the NEW command so that it can also be a unary operator. This will allow operations such as:
    set myCar=new $$Car()
  2. Adding a $OBJECT() intrinsic, which would take two parameters: a variable name and a type name.
    $OBJECT(variable,type)

    would return a true value if the node referenced by variable (which can be a global or a local variable, either of which may include subscripts) matches the type named by type.

  3. Adding a this keyword, which would return the entire current object.
  4. Adding conventions to the language. Any NEW commands at the top level of a routine would define instance variables, and the top-level routine without any tag specified becomes the default constructor for an object.

The main thing I’d like to see, though, is the ability (shown in the hypothetical sample below) to store entire object instances in MUMPS globals and retrieve them later:

;; when tag and routine are same, use the default constructor
Car
 new color,doors,started 
 set color="",doors=0,started=0
 quit
;; overloaded constructor (limitation: 
;; you could only overload the constructor once)
Car.Car(color,doors) 
 set this.color=color,this.doors=doors
 quit this
;; procedure method
Car.Start 
 set this.started=1
 write "Car started",!
 quit
;; procedure method
Car.Stop 
 set this.started=0
 write "Car stopped",!
 quit
;; extrinsic function method
Car.Color() 
 quit this.color
;; extrinsic function method
Car.Doors() 
 quit this.doors
;; extrinsic function method
Car.Started() 
 quit this.started

TestCar
 ;; new as a unary operator as well as a command
 new wifeCar
 set wifeCar=new $$Car("blue",5)
 new myCar
 set myCar=new $$Car("silver",4)
 ;; we're now going to store myCar and wifeCar in subscripts 
 ;; of the ^persistentCars global
 set ^persistentCars("John")=myCar 
 set ^persistentCars("Miri")=wifeCar
 ;; $object(variable,type) would return a false value if "variable"
 ;; is not of type "type"
 new memCar
 if $object(^persistentCars("Miri"),Car) do 
 . ; this use of the new operator will fail 
 . ; if ^persistentCars("Miri") is not a Car
 . set memCar=new $object(^persistentCars("Miri"),Car) 
 . ; we can now access Car's methods from memCar
 . do memCar.Start 
 . if $$memCar.Doors()>2 write "family car!",!
 . else  write "coupe!",!
 . set ^otherPersistentCars("Miri")=memCar
 else  write "The global did not contain a valid Car",!
 quit