adds convergent mode to pick_open_reference_otus #1958
|
Thanks for adding more documentation, @gregcaporaso. The problem I found is that when the input files differ greatly in size (e.g. in the EMP some input files are 80 GB while others are under 1 GB), the number of sequences analyzed per iteration drops once the smaller files have been processed. The change I'm planning is to adjust the number of sequences taken from each input file at each step dynamically, so that every iteration analyzes approximately the same number of sequences. Does this sound reasonable to you, @gregcaporaso? Another change would be to allow convergent mode with a single input file as well, so that we can analyze extremely large datasets in a convergent manner. Do you also agree with this change, @gregcaporaso? |
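To make the first proposal concrete, here is a rough sketch of how the per-file counts for one iteration might be chosen. It is not the code in this PR: the function name `allocate_iteration_counts`, its arguments, and the even-split-then-redistribute strategy are all assumptions for illustration only.

```python
def allocate_iteration_counts(remaining, target):
    """Choose how many sequences to take from each file in one iteration.

    Hypothetical helper, not part of QIIME.
    remaining: dict mapping file name -> sequences not yet processed.
    target: approximate total number of sequences wanted in this iteration.
    """
    allocation = {name: 0 for name in remaining}
    # Files that can still contribute sequences to this iteration.
    active = {name for name, left in remaining.items() if left > 0}
    budget = target
    while budget > 0 and active:
        # Split what is left of the budget evenly among the active files.
        share = max(budget // len(active), 1)
        for name in sorted(active):
            if budget == 0:
                break
            take = min(share, remaining[name] - allocation[name], budget)
            allocation[name] += take
            budget -= take
        # Files that gave everything they had drop out; any unused part of
        # their share is redistributed to the others in the next pass.
        active = {n for n in active if allocation[n] < remaining[n]}
    return allocation


if __name__ == "__main__":
    # One large file and two small ones, loosely mimicking the EMP situation.
    print(allocate_iteration_counts({"a": 100, "b": 10, "c": 1000}, 300))
```

With this scheme the iteration total stays at `target` for as long as the files together still hold that many sequences, because whatever a small file cannot supply is drawn from the larger ones.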
|
Build results will soon be (or already are) available at: http://ci.qiime.org/job/qiime-github-pr/1603/ |
|
Both of those sound like good additions, but I think you should focus on the first one since we have an immediate application (EMP). Does the process still seem to be working for that analysis? |
|
Yeah, I will focus on the first one. The process seems to be working correctly on that data. |
|
closing in favor of #1959 |
This replaces #1951.
We still need to do some more testing before this is merged, though. @josenavas, how is the EMP run going with this? As an additional test, can you confirm that all sequences are accounted for after the different iterations? The count of input sequences should equal the count of sequences in the iteration's OTU map before singleton filtering.
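A minimal sketch of that accounting check, assuming the input is FASTA (one `>` header line per sequence) and the OTU map is tab-delimited with the OTU id in the first field followed by its sequence ids. The function names are hypothetical and not part of QIIME.

```python
def count_fasta_seqs(lines):
    """Count sequences in FASTA-formatted lines (one '>' header each)."""
    return sum(1 for line in lines if line.startswith(">"))


def count_otu_map_seqs(lines):
    """Count sequence ids in OTU map lines.

    Assumes each line is: otu_id<TAB>seq_id_1<TAB>seq_id_2 ...
    """
    total = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 1:
            # Every field after the OTU id is one sequence id.
            total += len(fields) - 1
    return total


def all_seqs_accounted_for(fasta_lines, otu_map_lines):
    """True if the OTU map (before singleton filtering) covers every input."""
    return count_fasta_seqs(fasta_lines) == count_otu_map_seqs(otu_map_lines)


if __name__ == "__main__":
    fasta = [">s1\n", "ACGT\n", ">s2\n", "GGCC\n", ">s3\n", "TTAA\n"]
    otu_map = ["otu0\ts1\ts2\n", "otu1\ts3\n"]
    print(all_seqs_accounted_for(fasta, otu_map))
```

On real data the same functions could be handed open file objects for an iteration's input and its pre-filtering OTU map, since both only iterate over lines.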