2 posts tagged “cellml”
Today was great fun - lots of presentations and lots of lively discussions, of which we were all a part, but which Nicolas Le Novère ("shown" left, courtesy of Falko Krause :) ) also enjoyed.
Here are the notes!
CellML: Catherine Lloyd
Most of the talk aligned with the talk Catherine gave at BioSysBio 2009 this past week. Some parts were new, however. For instance, she seemed to spend a little more time on versioning. A version is an update of a model entry - usually with a traceable model history. A variant is a slightly different model from the same reference. A variant could be the same model adapted for a different cell type. Alternatively, variants of a model may be created to reproduce the different figures from a publication.
libAnnotationSBML: Neil Swainston
Automatic Linking of MIRIAM Annotation to a model using web services. He was involved with the creation of the SBML metabolic yeast network, which had MIRIAM annotations. And now that this qualitative information has been published, they're doing some experiments to get quantitative data. They developed a simple CellDesigner plugin as proof-of-concept to allow the linking of a model to their quantitative data repository (not finished yet).
MIRIAM annotations are a form of tagging the model. However, they want to do more: use the annotations to "reason" over the model. By "reason", they mean doing more than just seeing if the model is annotated: they want to see whether the model is annotated well. Do the reactions balance? Such a question cannot be answered by libSBML alone, but ChEBI can help. As a human, you would go to the ChEBI entry, get the formula, and compare it to your reaction. Can this be done automatically?
libAnnotationSBML connects to ChEBI, KEGG, UniProt and MIRIAM. This information is presented in a single convenience class. The library includes an "SBML Reaction Balance Analyser". It doesn't make any automatic corrections, but it can identify where something doesn't match with ChEBI. They would like to do corrections automatically in the near future. They would also like to suggest corrections to existing models (incorrect annotations, missing reactants / products, stoichiometry), and to intelligently generate models.
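To make the reaction-balance idea concrete, here is a minimal Python sketch of the kind of check described above. It is not libAnnotationSBML's code or API: the FORMULAS table is a hypothetical stand-in for formulas that would be fetched from ChEBI, the function names are my own, and the parser only handles simple formulas without brackets or charges.

```python
import re
from collections import Counter

# Hypothetical lookup table standing in for formulas retrieved from ChEBI.
FORMULAS = {
    "glucose": "C6H12O6",
    "oxygen": "O2",
    "carbon dioxide": "CO2",
    "water": "H2O",
}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_totals(side):
    """Sum atom counts over one side of a reaction given as {species name: stoichiometry}."""
    totals = Counter()
    for name, stoich in side.items():
        for element, n in parse_formula(FORMULAS[name]).items():
            totals[element] += stoich * n
    return totals


def imbalance(reactants, products):
    """Return {element: reactant count - product count} for every unbalanced element."""
    left, right = side_totals(reactants), side_totals(products)
    return {e: left[e] - right[e] for e in set(left) | set(right) if left[e] != right[e]}


# Usage: an empty result means the reaction balances.
print(imbalance({"glucose": 1, "oxygen": 6}, {"carbon dioxide": 6, "water": 6}))
```

Like the analyser described in the talk, this only reports where the atoms don't add up; it makes no attempt to correct the stoichiometry.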
Future: support more web services, write it in C++, or perhaps ask the MIRIAM people to have a web service method that retrieves the URL for the WSDL as well as the human-readable URL. However, connections to web services tend to be inconsistent, and therefore you can't always get the information you want.
semanticSBML: Falko Krause
You can find more information here: http://sysbio.molgen.mpg.de/semanticsbml/. Here there is a standalone GUI which is capable of offline annotation. There is also a web interface.
This is in fact a much more interesting application than is suggested by the notes - mainly I was preoccupied with making sure my talk was ready to go, as it was almost my turn. I highly recommend that you have a look at the link above and have a play with this software.
Saint
I didn't speak directly about Saint, as I will be speaking about MFO instead this afternoon. However, as model annotation was being talked about today, I thought it might be useful for me to put up some information about Saint. The presentation and video will be up on the IET website (but aren't yet). In the meantime, here's a rundown of the purpose of Saint.
The creation of accurate quantitative Systems Biology Markup Language (SBML) models is a time-intensive manual process. Modellers need to know and understand both the systems they are modelling and the intricacies of SBML. However, the amount of relevant data for even a relatively small and well-scoped model is overwhelming. Saint, an automated SBML annotation integration environment, aims to aid the modeller and reduce development time by providing extra information about any given SBML model in an easy-to-use interface. Saint accepts SBML-formatted files and integrates information from multiple databases automatically. Any new information that the user agrees with is then automatically added to the SBML model.
The initial functionality of Saint allows the annotation of already-extant species and suggests additional interactions. The user uploads their SBML model, and the portions of the model recognized by Saint are then displayed using a tabular structure. The user can then remove any items they are not interested in annotating. For instance, some terms such as "sink" are modelling artefacts and do not correspond to genes or proteins. Therefore, the user would normally wish to delete such terms from the search space to prevent any possible matches with actual biological species of a similar name. Once the user is satisfied with the list of items to be annotated, the model is submitted using the "Annotate Listed Items" button at the bottom of the table. A summary of the annotation returned by Saint is then added to the main table. The user can then remove any new annotation that is unsuitable for their model. At any stage, the user may click on the "Annotated Model" tab in Saint, which adds all new annotation to the original model and presents the new SBML model for viewing and download.
While there are a number of tools available for manipulating and validating SBML (e.g. libSBML), simulating SBML models (e.g. BASIS and the SBML Toolbox), analysing simulations (e.g. COPASI), and running modelling workflows (e.g. Taverna), Saint is the first to provide basic automatic annotation of SBML models in an easy-to-use GUI. The purpose of Saint is to aid the researcher in the difficult task of information discovery by seamlessly querying multiple databases and providing the results of that query within the SBML model itself. Because Saint provides a modelling interface to existing data integration resources, modellers are able to add valuable information to models quickly and simply.
Saint already generates reactions and associated new species and species references. This creation of reactions is being extended to also generate skeleton models based around a species or pathway of interest.
SBO: Nick Juty
The SourceForge website has a tracker as well as access to the whole project. You can browse the whole tree from http://www.ebi.ac.uk/sbo. A search retrieves a series of tables, which include obsolete terms so that you can tell what used to be there. The main curation work happens via a web interface that talks directly to the database (this is just for curation). Lots of web services are available.
From SBML to SBGN through SBO: Alice Villeger
Semantic annotations as a bridge between standards. Showed a very nice modification to the SBGN reference card where she colored sections by their SBO branch, which then showed up areas where different branches were used for the same type of notation (and therefore were candidates for modification within SBO). She showed that the SBML info needed is in Species Reference => this can be solved by changing the current SBGN specs. Further, there are some SBO terms that have no direct SBML equivalent (e.g. or, and). She gave a number of other examples, too.
It also seems that the compartment in SBGN and the SBML specification don't match. This is because the SBML compartment is not intended to be the same as the SBGN compartment (a functional versus a physical compartment).
Her analysis of the alignment of SBGN and SBO showed up a number of inconsistencies. This was really useful. There should be some machine-readable expression of SBML x SBO and SBGN x SBO. Further, there aren't many models annotated with SBO yet. And, if they are, they are not always sufficiently precise. One solution could be a MIRIAM to SBO converter program.
http://arcadiapathways.sourceforge.net
http://biomodels.net/meetings/2009/index.html
Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else's. I'm happy to correct any errors you may spot - just let me know!
Catherine Lloyd et al.
Auckland Bioengineering Institute
CellML is an XML-based markup language which builds on existing standards (e.g. MathML and RDF). Why is a standard format needed at all? The answer lies in the publishing process. A modeller starts out writing the model in whatever language they want, but then when others want to access the model from a publication, how can they run it or understand it? Also, the writing out of the model as a series of equations or graphics can introduce the possibility of errors. Why not just publish in MATLAB? Why bother putting it in CellML? Well, MATLAB isn't used by everyone. And where it is used, it's a procedural language and distinct from the published paper, which has nothing procedural.
Although they have best-practice standards, there are no requirements. This flexible structure can be used to describe a wide range of types of models: electrophysiology, immunology, cell cycle, muscular contraction, synthetic biology and more. There are some limitations: CellML is good at describing models at the molecular and cellular level, but not so good at the tissue scale. However, work is underway on this cross-scale modelling.
CellML has a modular structure, allowing models to be broken into components. CellML has an import feature that allows you to stick bits of models together, like Lego bricks. SBML doesn't have this yet, though it is planned for future versions. This import feature is really useful, and saves time. In CellML, entities (e.g. proteins) and processes (e.g. reactions) can be shared between models. Imports are also helpful for models with repeating units. For a cell/pacemaker model, a pacemaker unit can be defined once and imported many times.
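As a rough illustration of the "define once, import many times" idea, here is a small Python sketch that generates a model importing the same pacemaker unit several times. It assumes the CellML 1.1 import syntax as I understand it (an import element with an xlink:href, containing a component element with name and component_ref attributes). The file name, component name and the build_model helper are all made up for the example, and the output has not been validated against any CellML tool.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed from the CellML 1.1 and XLink specifications.
CELLML_NS = "http://www.cellml.org/cellml/1.1#"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("cellml", CELLML_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_model(name, unit_file, unit_component, copies):
    """Build a model element that imports one component from unit_file `copies` times."""
    model = ET.Element(f"{{{CELLML_NS}}}model", {"name": name})
    for i in range(1, copies + 1):
        # One import element per copy, each pointing at the same source file.
        imp = ET.SubElement(
            model, f"{{{CELLML_NS}}}import", {f"{{{XLINK_NS}}}href": unit_file}
        )
        # The local name differs per copy; component_ref names the unit in the source file.
        ET.SubElement(
            imp,
            f"{{{CELLML_NS}}}component",
            {"name": f"{unit_component}_{i}", "component_ref": unit_component},
        )
    return model


# Usage with a hypothetical file and component name.
model = build_model("pacemaker_tissue", "pacemaker_unit.cellml", "pacemaker", 3)
print(ET.tostring(model, encoding="unicode"))
```

The pacemaker equations live only in the imported file, so a fix there would reach every copy.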
They have two tools (PCEnv and COR) to help develop CellML models. PCEnv allows development in CellML and then export to other formats such as MATLAB, C, Python etc. PCEnv runs on Windows, Linux and Mac; COR is Windows only. Both tools can also run simulations. PCEnv also shows embedded SVG diagrams of all the models in the repository.
The CellML Model Repository: http://www.cellml.org/models
This repository has over 380 models, all free for download. The majority are from published papers. For each model entry, there is a short description, a curation status and a schematic diagram. Model curation includes model validation and documentation. Of the 380 models, only 4 have been translated straight from the published paper into a working CellML model (i.e. without help from the curation team first). This is because there are often typographical errors in the paper, a lack of unit definitions, missing parameters, missing initial conditions, missing equations etc. At the moment they have a star system. 0 = not curated yet. 1 = maths consistent with published paper. 2 = model is complete and reproduces the results in the published paper. 3 = model satisfies physical constraints, e.g. conservation of mass, momentum, charge etc. Other problems: for some older models the original code is not available.
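The star system maps naturally onto an ordered enumeration. The following Python sketch is my own illustration, not anything from the repository: the CurationLevel names, the models_at_least helper and the model names in the usage example are all hypothetical.

```python
from enum import IntEnum


class CurationLevel(IntEnum):
    """Star levels as described in the talk; the member names are my own."""
    NOT_CURATED = 0
    MATHS_MATCHES_PAPER = 1
    REPRODUCES_RESULTS = 2
    SATISFIES_PHYSICAL_CONSTRAINTS = 3


def models_at_least(models, level):
    """Return the sorted names of models whose star level is at least `level`."""
    return sorted(name for name, stars in models.items() if stars >= level)


# Usage with made-up model names and star levels.
example = {
    "model_a": CurationLevel.NOT_CURATED,
    "model_b": CurationLevel.REPRODUCES_RESULTS,
    "model_c": CurationLevel.SATISFIES_PHYSICAL_CONSTRAINTS,
}
print(models_at_least(example, CurationLevel.REPRODUCES_RESULTS))
```

Using an IntEnum keeps the levels comparable, so "at least two stars" is a plain comparison.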
There's lots of collaboration with SBML. Currently the diagrams are made manually, and there's no reason why it can't be done automatically, and that's being worked on now. If we want to encourage (via journals) modellers to put their models into SBML or CellML, we need to provide really nice tools and help making the models.
Tuesday Session 2
http://friendfeed.com/rooms/biosysbio
http://conferences.theiet.org/biosysbio