Framework mapping

There is only one command in CARTHAGENE to perform initial framework mapping, but it is quite powerful. It is the buildfw command, and it may also perform comprehensive mapping. The command automatically selects a subset of the current set of active markers and orders them. The aim is to build a map such that no alternative order of the selected markers has a loglikelihood within a given threshold of the loglikelihood of the map.
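Stated a little more formally, and only as a reading aid: the symbols below are notation introduced here, not taken from CARTHAGENE. Write $S$ for the selected markers, $\pi^*$ for the order of the map that is built, $\ell(\pi)$ for the loglikelihood of an order $\pi$ of $S$, and $\delta$ for the threshold. Treating an order and its reverse as the same map is an assumption of this sketch.

```latex
% Framework condition at threshold \delta (notation assumed for illustration)
\[
  \ell(\pi^*) - \ell(\pi) > \delta
  \qquad \text{for every order } \pi \text{ of } S
  \text{ other than } \pi^* \text{ and its reverse.}
\]
```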

The procedure used to tackle this difficult problem is a heuristic inspired by similar functionalities in RHMAP [LBLC95] and CRIMAP [Gre88]. It uses an incremental insertion method that tries to insert the available markers into a current map. Each candidate marker is tentatively inserted at every possible position of the map. The difference in loglikelihood between the best insertion and the second best insertion is used to qualify the marker. The marker actually inserted is the one that maximizes this difference, provided that the difference is larger than a first threshold (called the ``Adding Threshold''). The marker is inserted at its optimal position. If there are orders whose difference in loglikelihood with this best position is less than a second threshold (called the ``Keep Threshold''), they are also kept as possible new starting points for the next marker to be inserted. This threshold must naturally be equal to or larger than the adding threshold.
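One insertion step of this kind can be sketched in Python as follows. This is an illustration of the idea only, not CARTHAGENE's implementation: the names insertion_step, rank_insertions and toy_loglike, the marker names and their coordinates are all hypothetical, and the toy score (minus the total path length between adjacent markers) merely stands in for a real multipoint loglikelihood.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical marker coordinates, used only to build a toy score.
POSITIONS = {"m1": 0, "m2": 10, "m3": 20, "m4": 30, "m5": 14, "m6": 15}

Order = List[str]
Score = Callable[[Sequence[str]], float]


def toy_loglike(order: Sequence[str]) -> float:
    """Stand-in for a loglikelihood: minus the path length along the order."""
    return -float(sum(abs(POSITIONS[a] - POSITIONS[b])
                      for a, b in zip(order, order[1:])))


def rank_insertions(current: Sequence[str], marker: str,
                    loglike: Score) -> List[Tuple[float, Order]]:
    """Insert `marker` at every position of `current`; return best first."""
    ranked = []
    for i in range(len(current) + 1):
        order = list(current[:i]) + [marker] + list(current[i:])
        ranked.append((loglike(order), order))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


def insertion_step(current: Sequence[str], available: Sequence[str],
                   loglike: Score, adding_threshold: float,
                   keep_threshold: float
                   ) -> Optional[Tuple[str, float, List[Order]]]:
    """Pick the marker whose best/second-best gap is largest.

    Returns (marker, gap, kept_orders), or None when no marker has a gap
    strictly larger than `adding_threshold`.
    """
    chosen = None
    for marker in available:
        ranked = rank_insertions(current, marker, loglike)
        gap = ranked[0][0] - ranked[1][0]  # best minus second best
        if gap > adding_threshold and (chosen is None or gap > chosen[1]):
            chosen = (marker, gap, ranked)
    if chosen is None:
        return None
    marker, gap, ranked = chosen
    top = ranked[0][0]
    # Keep the best order plus every order closer than the keep threshold.
    kept = [order for score, order in ranked if top - score < keep_threshold]
    return marker, gap, kept
```

With the toy data above, calling insertion_step(["m1", "m2", "m3", "m4"], ["m5", "m6"], toy_loglike, 3.0, 3.0) selects m6, whose gap (10) exceeds that of m5 (8). A full build would repeat this step from each kept order until no marker passes the adding threshold.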

The build process may start from an empty map or from an existing order (in this case, it must contain at least 3 markers). When the starting map is empty, all triplets are considered as candidates and only the best triplet according to the difference in loglikelihood between the best order and the second best order is used. Otherwise, the initial order specified is used directly.
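The choice of the initial triplet can be sketched the same way. Again, this is a hedged illustration rather than the actual code: best_triplet, toy_loglike and the four marker coordinates are hypothetical, and the sketch assumes that an order and its reverse are equivalent, so that each triplet has exactly three distinct orders (one per choice of middle marker).

```python
from itertools import combinations
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical marker coordinates, used only to build a toy score.
POSITIONS = {"a": 0, "b": 10, "c": 11, "d": 25}


def toy_loglike(order: Sequence[str]) -> float:
    """Stand-in for a loglikelihood: minus the path length along the order."""
    return -float(sum(abs(POSITIONS[x] - POSITIONS[y])
                      for x, y in zip(order, order[1:])))


def best_triplet(markers: Sequence[str],
                 loglike: Callable[[Sequence[str]], float]
                 ) -> Optional[Tuple[float, Tuple[str, str, str]]]:
    """Return (gap, order) for the triplet with the largest gap between
    its best and second best order, or None with fewer than 3 markers."""
    best = None
    for x, y, z in combinations(markers, 3):
        # Up to reversal, a triplet has three orders: one per middle marker.
        orders = [(x, y, z), (y, x, z), (x, z, y)]
        ranked = sorted(orders, key=loglike, reverse=True)
        gap = loglike(ranked[0]) - loglike(ranked[1])
        if best is None or gap > best[0]:
            best = (gap, ranked[0])
    return best
```

With these toy coordinates, best_triplet(["a", "b", "c", "d"], toy_loglike) returns the order a, c, d with a gap of 11: triplets containing both b and c score poorly because those two markers are too close to be ordered confidently.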

After a map is built using this process, some optional postprocessing can be applied: each remaining non-insertable marker is inserted, one at a time, at every possible position in the final map and then removed. For each such marker, the command reports how the loglikelihood changes with respect to the best insertion point. The best position is marked with a ``+''. For every other position, the difference in loglikelihood with the best position is printed if this difference is less than a threshold.
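The postprocessing report for one leftover marker might be computed along these lines. This is a sketch under assumptions: insertion_report, toy_loglike and the coordinates are hypothetical, and the returned (position, label) pairs only mimic the idea of the report, not the layout that buildfw actually prints.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical marker coordinates, used only to build a toy score.
POSITIONS = {"m1": 0, "m2": 10, "m3": 20, "m4": 30, "x": 14}


def toy_loglike(order: Sequence[str]) -> float:
    """Stand-in for a loglikelihood: minus the path length along the order."""
    return -float(sum(abs(POSITIONS[a] - POSITIONS[b])
                      for a, b in zip(order, order[1:])))


def insertion_report(framework: Sequence[str], marker: str,
                     loglike: Callable[[Sequence[str]], float],
                     threshold: float) -> List[Tuple[int, str]]:
    """Try `marker` at every position of `framework` without modifying it.

    Returns (position, label) pairs: "+" for the best position, and the
    loglikelihood difference for positions closer than `threshold`.
    """
    scores = []
    for i in range(len(framework) + 1):
        trial = list(framework[:i]) + [marker] + list(framework[i:])
        scores.append(loglike(trial))
    best = max(scores)
    best_index = scores.index(best)
    report = []
    for i, score in enumerate(scores):
        diff = best - score
        if i == best_index:
            report.append((i, "+"))
        elif diff < threshold:
            report.append((i, f"{diff:.2f}"))
    return report
```

Here position i means "insert before the i-th marker of the framework", with the last position meaning "append at the end". Positions whose difference reaches the threshold are simply omitted from the report.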

It is important to note that, whatever thresholds you specify, this procedure may easily be fooled, and the final map produced is not guaranteed to be a true framework map (i.e., a map such that all alternative orders have a difference in loglikelihood with the initial map larger than the ``adding threshold''). This is not a weakness specific to CARTHAGENE but a weakness of all simple heuristic procedures. To be more confident about the quality of your maps, it is strongly advised to apply improving commands to them and to analyse the heap to check that no close alternative order appears in it.

Alternatively, two other framework mapping commands, frameworkmst and framework, guarantee a minimum distance, measured in (normalized) 2-point likelihood, between the framework map and any other map with the same markers.

Thomas Schiex 2009-10-27