Parsing Life: 2011

Safely extracting possible NULLs

The following occurred to me while thinking about a way to store NULLs safely. It's just a quick shot, so it's heavily debatable. What I'm really aiming at is something that is supported by the language itself, or at least by a very short language construct, avoiding the clumsy getter/setter semantics. I just couldn't do it quickly in C++...

#include <cstddef>
#include <exception>
#include <iostream>
using namespace std;

// Wraps an int pointer that may be NULL and insists on ask() before access.
class IntMaybeNull {
private:
    int *thePointer;
    bool hasBeenAsked;  // set by ask(), reset whenever the pointer changes
public:
    IntMaybeNull(int *aPointer) : thePointer(aPointer), hasBeenAsked(false) {}
    void setPointer(int *newPointer) {
        thePointer = newPointer;
        hasBeenAsked = false;
    }
    // Records that the caller checked, and reports whether the pointer is usable.
    bool ask() {
        hasBeenAsked = true;
        return (thePointer != NULL);
    }
    // Throws unless ask() was called first and the pointer is non-NULL.
    int *getPointer() const {
        if (hasBeenAsked && thePointer != NULL)
            return thePointer;
        throw exception();
    }
};

int main() {
    int x = 20;
    IntMaybeNull i(&x);
    try {
        i.getPointer();  // no ask() yet, so this throws
    } catch (const exception &) {
        cout << "dereferencing without asking - buuuh!" << endl;
    }

    try {
        if (i.ask()) {
            int *ii = i.getPointer();  // asked and non-NULL, so this succeeds
            cout << *ii << endl;
        }
    } catch (const exception &) {
        cout << "dereferencing without asking - buuuh!" << endl;
    }
    return 0;
}
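
For comparison, C++17's std::optional gets somewhat closer to the "supported by the language" wish, because the possible absence of a value becomes part of the type. What follows is only a minimal sketch, assuming a C++17 compiler; findAge and describe are hypothetical names invented for this illustration. It does not enforce the "ask first" rule of the class above: value() throws std::bad_optional_access on an empty optional, but operator* on an empty optional is undefined behaviour.

```cpp
#include <optional>
#include <string>

// Hypothetical lookup: yields a value only for a known name.
std::optional<int> findAge(const std::string& name) {
    if (name == "alice") return 20;
    return std::nullopt;  // the "NULL" case, visible in the return type
}

// The if-with-initializer form plays the role of ask(): the value is
// only read inside the branch where the optional is known to be set.
std::string describe(const std::string& name) {
    if (std::optional<int> age = findAge(name)) {
        return name + " is " + std::to_string(*age);
    }
    return name + " is unknown";
}
```

Calling describe("alice") takes the first branch, and any other name falls through to the "unknown" branch without ever touching a missing value.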

Regular Expressions Should Be Handled At Compile Time!

So, I talked with a friend about what our ideal programming language might look like.

One of the points that he raised was that regular expressions, and some similar constructs, should be known and compiled/handled by the compiler itself, so an error in a regex can be a compile-time error.

That is a GREAT idea!

Our first thought was that regular expressions, and a few other such constructs, should be a language feature. However, language features are really the last resort - they're limited to what the language designer thought of, they cannot easily be extended, and they're a "special case" compared to normal functions, even though behind the scenes they do (in many cases at least) nothing that a normal function couldn't do.

We came up with an idea for an additional compiler phase that is accessible to a library via some sort of hook. What's needed is something like the following:

compileTimeImport com.superduper.regex(@NewIsImplied);
...
List list = regex{"^\w+$"}.match("thisshouldmatch");

The curly braces would tell the compiler that we are trying to use a compiler-phase feature, and that it needs to check all parameters for being constant.

That way, if the compiler encounters something like

List list = regex{"^[a-+$"}.match("thisshouldmatch");

it could complain, much like Perl does:

$ perl -e '$x =~ /^[a-+$/;'
Invalid [] range "a-+" in regex; marked by <-- HERE in m/^[a-+ <-- HERE $/ at -e line 1.
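
Something in this spirit can be approximated in today's C++ with a constexpr function and static_assert. The sketch below is deliberately tiny and rests on assumptions: bracketsAreValid is a hypothetical helper that only checks bracket expressions (closed brackets, ascending ranges) and ignores escapes and the rest of regex syntax, so it is nowhere near a real regex compiler. It needs C++17 for std::string_view.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical, simplified validator: returns false if a '[' is never
// closed or if a range such as "a-+" runs from a higher to a lower character.
constexpr bool bracketsAreValid(std::string_view pattern) {
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] != '[') { ++i; continue; }
        ++i;  // step past '['
        bool closed = false;
        while (i < pattern.size()) {
            if (pattern[i] == ']') { closed = true; ++i; break; }
            // "x-y" counts as a range only when y is not the closing bracket.
            if (i + 2 < pattern.size() && pattern[i + 1] == '-' && pattern[i + 2] != ']') {
                if (pattern[i] > pattern[i + 2]) return false;  // descending range
                i += 3;
            } else {
                ++i;
            }
        }
        if (!closed) return false;
    }
    return true;
}

// Both checks are evaluated by the compiler; a failing one stops the build.
static_assert(bracketsAreValid("^[a-z]+$"), "well-formed pattern");
static_assert(!bracketsAreValid("^[a-+$"), "the broken pattern is detected at compile time");
```

Writing static_assert(bracketsAreValid("^[a-+$"), "bad regex") would turn the broken pattern into a compile error, which is the behaviour wished for here, although only for patterns that are compile-time constants.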

Of course, you couldn't use the curly-brace form for non-constant expressions, which is a sad and severe limitation. For that case, we might use square brackets instead, like this:

String matchString = ...whatever expression...;

List list = regex["^\w+"+matchString+"$"].match("thisshouldmatch");
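
For reference, this non-constant case is roughly what std::regex in C++11 already does: the pattern is only checked when the regex object is constructed at run time, and a malformed pattern raises std::regex_error. A minimal sketch, where MatchResult and tryMatch are hypothetical names chosen for this illustration:

```cpp
#include <regex>
#include <string>

enum class MatchResult { Matched, NotMatched, BadPattern };

// Builds the regex at run time, so a bad pattern only shows up here,
// as an exception, rather than as a compile error.
MatchResult tryMatch(const std::string& pattern, const std::string& input) {
    try {
        const std::regex re(pattern);  // throws std::regex_error if malformed
        return std::regex_match(input, re) ? MatchResult::Matched
                                           : MatchResult::NotMatched;
    } catch (const std::regex_error&) {
        return MatchResult::BadPattern;
    }
}
```

With this helper, a pattern assembled from strings, such as "^\\w+" + matchString + "$", is validated on every call, which is exactly the cost the compile-time form is meant to avoid.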

So, why the curly or square brackets?

First, they imply a "new" operator, and thereby help keep the code short.
Second, I want to make sure that the compiler understands me - if I use curly brackets, but then use a non-const expression, then I want it to throw an error. Of course, the compiler might just be able to figure it out, but I don't want to inadvertently use the wrong form.

Should we decide that we really want to go for the automatic decision here, we could just allow for the use of normal brackets:

String matchString = ...whatever expression...;

List list = regex("^\w+"+matchString+"$").match("thisshouldmatch");

I think those "little things" might help make programming a much more enjoyable waste of time.

The Name of The Blog

This blog owes its existence to a friendly evening spent consuming beer & falafel with renowned coder and blogger supreme, Adrian Smith, whose blog "Databases&Life" has inspired many a fun debate. You can visit his blog @ http://www.databasesandlife.com/.

So I thought it fitting to use a similar naming scheme for my own IT-related blog. And since I like parsing, and all human life can in some way be considered the parsing of said life, "Parsing Life" is the name I came up with.

Which said blog I hereby officially (and somewhat proudly) introduce to the public.

I am pretty convinced at this point that this will never be high-frequency. I'll just post whenever I feel like it. Probably once a week, or more probably once a month.

After all, A Blog Is A Blog Is A Blog...

Thursday, 20 October 2011
