Aperio - Solvent Distributions

When I joined the Genetics program in late T3, many fine products were available. There was a great flax seed (Tedra's Folly), some great wines (Eigam, I think), and a still-active genetics expert, Ariella, who was heavily invested in flowers and mutagens. I was able to pick her brain quite a bit, and the others to a limited degree. In doing so, I took MANY, MANY missteps, but materials were plentiful and the Tale had slowed to a crawl, so there was a high tolerance for failure since nothing else was happening. I also provided value as a sounding board and by making productive use of materials long since discarded as useless. This page elaborates on that technique, and could perhaps be integrated with other T3 Genetics pages if it's deemed still of value.

The prevailing wisdom has always been to use the latest and greatest, and in Genetics Research this meant Crystal and Diamond Solvent. These solvents are miraculous in the depth of information they provide in a single "try", offering 6 and 7 bits, respectively, out of a genome that might be only twice that length (though 4x is more common across the board). However, all that comes at a cost: the mushrooms used are the rarest in the game, and you get double jeopardy in that the quantity produced was a fraction (1/7) of what was produced by the same quantity of common mushrooms (now 2x and 3x in T5, respectively) and the same quantity of supporting materials. The basic solvents (Milky, Clear, and Glass) are cheaper but provide far less information per "try": 3, 4, and 5 bits, respectively. Ultimately, Genetics Research stagnated into two questions: "what can I do with 5 bits?", with Glass being the most cost-effective choice, and "what genetics question can best be answered by investing more precious solvents?"

It turns out that there is another approach, based upon the simple premise that ATITD's much-maligned random generator will, with sufficient repetition, return each possible result in proportion to how often it actually occurs in the genome. By applying an enormous volume of tests with the cheapest solvents, you can average out the randomness of the research "tries" and infer relationships between the "tries" based upon volume. At the risk of getting ahead of myself, my technique compares the quantity of each combination against every other unique combination and reveals the ratio of those occurrences within the genome. With sufficient volume, this technique will tell you that certain combinations (which will overlap) occur in integral multiples when compared to other baseline combinations; specifically your endpoints, which will occur only once. I then feed the combinations into a simple application I wrote (though it's easy enough to illustrate by hand), which computes the genome. I'll migrate my T3 research here when time permits, but in the meantime I'll make up a simple example to illustrate (this assumes you've read the primers on the wiki).

Each bit (in Vines) is represented by one of the letters G, O, R, or Y, with K marking the endpoints. Each genome consists of some combination of these bits. Each solvent will return the appropriate number of bits IN ORDER from whatever section of the genome it randomly selects.

You might get KGO, KGOR, KGORY, KGORYY, and KGORYYY from Milky, Clear, Glass, Crystal, and Diamond, respectively, on a hypothetical genome where the try happens to catch the left endpoint.
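As a minimal sketch of the paragraph above (not the game's actual code), a solvent "try" can be modeled as cutting a fixed-width, in-order slice out of the genome. The bit counts come from this page; the names `SOLVENT_BITS` and `read_at` and the sample genome are made up for illustration.

```python
# Bits returned per "try", as stated on this page.
SOLVENT_BITS = {"Milky": 3, "Clear": 4, "Glass": 5, "Crystal": 6, "Diamond": 7}

def read_at(genome, solvent, start):
    """Return the in-order slice a solvent would report at position `start`.

    Hypothetical helper: in the game the position is chosen at random and
    is not visible to the player.
    """
    width = SOLVENT_BITS[solvent]
    if start < 0 or start + width > len(genome):
        raise ValueError("window does not fit inside the genome")
    return genome[start:start + width]

genome = "KGORYYYYK"  # hypothetical genome, K marks both endpoints
for solvent in SOLVENT_BITS:
    # Position 0 is the case where the try catches the left endpoint.
    print(solvent, read_at(genome, solvent, 0))
```

Each line printed is the left-endpoint read for one solvent, from 3 bits for Milky up to 7 bits for Diamond.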

If we apply Diamond Solvent to a fictional genome KGORYYYYK, there are only 3 possibilities: KGO RYY Y, GOR YYY Y, and ORY YYY K (separations are my convention for readability). Easy-peasy. We would have a pretty good idea that these 3 strings fit directly together, since they overlap on 5 or 6 of their 7 bits. With long genomes in excess of 30 bits, you would have to try dozens of times to reliably get two strings that overlap on 6 bits, and even then you could not account for a situation where that combination existed multiple times (there is an actual vine that has a dozen or more consecutive Gs, and you would NEVER be able to determine how many Gs were there directly).
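To check the Diamond example by hand or by machine, the following sketch lists every 7-bit read of the fictional genome and measures how many bits each pair shares end to start. The helper names `all_windows` and `overlap` are hypothetical.

```python
def all_windows(genome, width):
    """Every in-order slice of `width` bits, from left to right."""
    return [genome[i:i + width] for i in range(len(genome) - width + 1)]

def overlap(left, right):
    """Length of the longest proper suffix of `left` that is a prefix of `right`."""
    for n in range(min(len(left), len(right)) - 1, 0, -1):
        if left.endswith(right[:n]):
            return n
    return 0

reads = all_windows("KGORYYYYK", 7)  # Diamond returns 7 bits
print(reads)
for i, left in enumerate(reads):
    for right in reads[i + 1:]:
        print(left, right, overlap(left, right))
```

Neighbouring reads share 6 bits and the first and last reads share 5, which is why the three strings are easy to fit together.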

However, if we apply my technique to the same fictional genome KGORYYYYK using Milky Solvent (3 bits), there are many possibilities: KGO, GOR, ORY, RYY, YYY, YYY, YYK. Yes, I duplicated YYY, because there are TWO discrete occurrences, and you'll find that with a sufficient volume of "tries" this combination will appear twice as often as any combination that occurs only once. While Crystal and Diamond could directly reveal this situation IF YOU got lucky enough to hit it, Glass or less would NEVER directly show it (you'd never know how many Ys were intervening). However, by analyzing the proportions, you can guess. And you can apply this technique to arbitrarily large genomes and infer relationships you could never afford to analyze with the premium solvents.
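The proportion argument can be tried out with a small simulation. This is only a sketch under a loud assumption: that every position in the genome is equally likely to be picked on each "try", which this page takes as a premise but which is not confirmed game behavior. The functions `simulate_tries` and `estimate_multiplicities`, the seed, and the try count are all made up for illustration.

```python
import random
from collections import Counter

def simulate_tries(genome, width, tries, rng):
    """Count the reads from `tries` random tries.

    ASSUMPTION: every start position is equally likely.
    """
    last_start = len(genome) - width
    starts = (rng.randint(0, last_start) for _ in range(tries))
    return Counter(genome[s:s + width] for s in starts)

def estimate_multiplicities(counts):
    """Express each count as a whole multiple of the endpoint baseline.

    Reads that touch a K endpoint occur exactly once in the genome,
    so their average count serves as the "once" baseline.
    """
    endpoint_counts = [n for read, n in counts.items()
                       if read.startswith("K") or read.endswith("K")]
    if not endpoint_counts:
        raise ValueError("no endpoint read seen yet; run more tries")
    baseline = sum(endpoint_counts) / len(endpoint_counts)
    return {read: max(1, round(n / baseline)) for read, n in counts.items()}

rng = random.Random(0)  # fixed seed so the run is repeatable
counts = simulate_tries("KGORYYYYK", 3, 7000, rng)  # 3 bits, as with Milky
multiplicities = estimate_multiplicities(counts)
print(multiplicities)
# Each occurrence is one start position, so length = occurrences + width - 1.
print(sum(multiplicities.values()) + 3 - 1)
```

With a few thousand tries, YYY should come out at roughly twice the endpoint baseline, and the occurrence total should recover the 9-bit length. With far fewer tries the rounding becomes unreliable, which is the reason for insisting on volume.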

In short, the premium solvents will give you a general idea of what's in a genome and will reveal some specific smaller configurations that lesser solvents will not, including how they actually fit together. However, sufficient volume testing with a cheaper solvent will tell you the proportions between combinations and by extension the ACTUAL genome length.
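My application isn't reproduced here, but the assembly step can be sketched as a small backtracking search. This is a hypothetical stand-in, not the original program. It assumes you already know how many times each combination occurs (from the proportions), starts from the combination that begins with K, and keeps appending any unused combination that overlaps the current tail. The name `reconstruct` is made up for illustration.

```python
def reconstruct(multiplicities):
    """Return every genome that uses each combination exactly as often as given.

    `multiplicities` maps equal-width combinations to occurrence counts.
    More than one genome may fit; all candidates are returned, sorted.
    """
    width = len(next(iter(multiplicities)))
    total = sum(multiplicities.values())
    remaining = dict(multiplicities)
    results = set()

    def extend(genome, used):
        if used == total:
            if genome.endswith("K"):  # must finish on the right endpoint
                results.add(genome)
            return
        tail = genome[-(width - 1):]
        for read, left in remaining.items():
            if left > 0 and read.startswith(tail):
                remaining[read] -= 1
                extend(genome + read[-1], used + 1)
                remaining[read] += 1  # undo and try the next option

    for read in multiplicities:
        if read.startswith("K"):  # the left endpoint starts the genome
            remaining[read] -= 1
            extend(read, 1)
            remaining[read] += 1
    return sorted(results)

counts = {"KGO": 1, "GOR": 1, "ORY": 1, "RYY": 1, "YYY": 2, "YYK": 1}
print(reconstruct(counts))
```

For the fictional genome on this page, the only arrangement that uses YYY twice and ends on K is KGORYYYYK. On real genomes the search may return several candidates, in which case a few premium-solvent tries could help pick between them.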

Hope that helps someone! (Please Support My Research Grant - lol)
