Wednesday, November 23, 2005

Revisiting symbols

I'm taking yet another look at the bjv2 biological symbols model, and this time I've sliced it differently again. An Ambiguity represents something that is potentially ambiguous; a Symbol represents something that is not. Token is a super-interface of both. Alphabets are defined over Symbols and have methods to obtain the Ambiguities for those symbols. A TokenBuffer contains tokens and is specialised as AmbiguityBuffer and SymbolBuffer. Similarly, the io tokenization code works with Tokens, and it may be necessary to specialise it for ambiguities and symbols.
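To make that slicing concrete, here is a minimal sketch of how the types might fit together. It is not the actual bjv2 API: every method name (name(), matches(), symbols(), ambiguityFor()) and the Nucleotide, SimpleAmbiguity and DnaAlphabet types are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: these signatures are guesses, not the bjv2 code.

// Common super-interface of symbols and ambiguities.
interface Token { String name(); }

// Something that is not ambiguous.
interface Symbol extends Token { }

// Something potentially ambiguous: it matches a set of symbols.
interface Ambiguity<S extends Symbol> extends Token { Set<S> matches(); }

// An alphabet is over Symbols and hands out Ambiguities for them.
interface Alphabet<S extends Symbol> {
    Set<S> symbols();
    Ambiguity<S> ambiguityFor(Set<S> symbols);
}

// Storage for tokens, specialised below for symbols and ambiguities.
class TokenBuffer<T extends Token> {
    private final List<T> tokens = new ArrayList<>();
    void add(T token) { tokens.add(token); }
    T get(int index) { return tokens.get(index); }
    int length() { return tokens.size(); }
}

class SymbolBuffer<S extends Symbol> extends TokenBuffer<S> { }
class AmbiguityBuffer<S extends Symbol> extends TokenBuffer<Ambiguity<S>> { }

// Illustrative symbol class; the enum's built-in name() satisfies Token.
enum Nucleotide implements Symbol { A, C, G, T }

final class SimpleAmbiguity<S extends Symbol> implements Ambiguity<S> {
    private final Set<S> matches;
    SimpleAmbiguity(Set<S> matches) {
        this.matches = Collections.unmodifiableSet(new LinkedHashSet<>(matches));
    }
    public Set<S> matches() { return matches; }
    public String name() {
        StringBuilder sb = new StringBuilder("[");
        for (S s : matches) sb.append(s.name());
        return sb.append(']').toString();
    }
}

final class DnaAlphabet implements Alphabet<Nucleotide> {
    public Set<Nucleotide> symbols() { return EnumSet.allOf(Nucleotide.class); }
    public Ambiguity<Nucleotide> ambiguityFor(Set<Nucleotide> symbols) {
        // Copy into an EnumSet so the members come out in declaration order.
        EnumSet<Nucleotide> ordered = EnumSet.noneOf(Nucleotide.class);
        ordered.addAll(symbols);
        return new SimpleAmbiguity<>(ordered);
    }
}

class SymbolsDemo {
    public static void main(String[] args) {
        Alphabet<Nucleotide> dna = new DnaAlphabet();
        SymbolBuffer<Nucleotide> exact = new SymbolBuffer<>();
        exact.add(Nucleotide.A);
        AmbiguityBuffer<Nucleotide> fuzzy = new AmbiguityBuffer<>();
        fuzzy.add(dna.ambiguityFor(EnumSet.of(Nucleotide.G, Nucleotide.A)));
        // prints "A [AG]"
        System.out.println(exact.get(0).name() + " " + fuzzy.get(0).name());
    }
}
```

The point of the split is that a SymbolBuffer<Nucleotide> can never hold an ambiguity, so code that needs unambiguous data gets that guarantee from the compiler rather than from a runtime check.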

I hope that this way of slicing things will give better control over the symbols, their storage and the extra API that particular symbol classes expose. I'm also hoping to retro-fit basis symbols and use annotations to code-generate mapping functions that extract components of the basis symbols with full type-safety. For example, I want to be able to have an alignment with getQuery(), getSource() and getState(), where we know the classes of the three components, and can extract a TokenBuffer for just one of them, e.g. the state, still with type-safety.
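As a rough illustration of the alignment example, the sketch below hand-writes the kind of mapping that the annotations would eventually generate. Only getQuery(), getSource() and getState() come from the description above; AlignmentColumn, project(), Nucleotide and State are hypothetical names, and the Token, Symbol and TokenBuffer types are the same guesses as before, repeated so the example stands alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch only: not the bjv2 API and not generated code.
interface Token { String name(); }
interface Symbol extends Token { }

final class TokenBuffer<T extends Token> {
    private final List<T> tokens = new ArrayList<>();
    void add(T token) { tokens.add(token); }
    T get(int index) { return tokens.get(index); }
    int length() { return tokens.size(); }

    // Builds a new buffer by applying a component-extracting function to
    // each token. The result type R is checked at compile time.
    <R extends Token> TokenBuffer<R> project(Function<? super T, ? extends R> mapper) {
        TokenBuffer<R> out = new TokenBuffer<>();
        for (T token : tokens) out.add(mapper.apply(token));
        return out;
    }
}

// Illustrative component symbols.
enum Nucleotide implements Symbol { A, C, G, T }
enum State implements Symbol { MATCH, MISMATCH }

// A basis symbol with three typed components.
final class AlignmentColumn<Q extends Symbol, S extends Symbol, St extends Symbol>
        implements Symbol {
    private final Q query;
    private final S source;
    private final St state;

    AlignmentColumn(Q query, S source, St state) {
        this.query = query;
        this.source = source;
        this.state = state;
    }

    Q getQuery() { return query; }
    S getSource() { return source; }
    St getState() { return state; }

    public String name() {
        return "(" + query.name() + "," + source.name() + "," + state.name() + ")";
    }
}

class AlignmentDemo {
    public static void main(String[] args) {
        TokenBuffer<AlignmentColumn<Nucleotide, Nucleotide, State>> alignment =
                new TokenBuffer<>();
        alignment.add(new AlignmentColumn<>(Nucleotide.A, Nucleotide.A, State.MATCH));
        alignment.add(new AlignmentColumn<>(Nucleotide.C, Nucleotide.G, State.MISMATCH));

        // The compiler knows this is a buffer of State, not of Token.
        TokenBuffer<State> states = alignment.project(AlignmentColumn::getState);
        System.out.println(states.get(1).name()); // prints "MISMATCH"
    }
}
```

Assigning the result of alignment.project(AlignmentColumn::getState) to a TokenBuffer<Nucleotide> would fail to compile, which is the type-safety being asked for. A generated version would presumably just remove the need to write the accessors by hand.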

So, now that the API is roughly in place, I need to write some apps that use it, fill in the boring implementation code and see how easy/pretty it is to use.

