cappy
05.05.2012, 12:05
On Rosetta there is a thread about the CASP 10 event, which we will post here, with a preliminary Google translation for the more important posts. If anyone has a translation, please send it to me by PM and I will swap it in.
TJ - Message 72941 - Posted 30 Apr 2012 19:37:35 UTC
Hello everyone !
CASP 10, a community-wide experiment in structure prediction, starts tomorrow on May 1st and runs to August 1st. During this time we will be using BOINC heavily for structure prediction. If your work unit starts with the label rb, you're running a CASP 10 target! rb is short for Robetta, which is our publicly available server for structure prediction.
CASP
CASP is an international experiment to assess the state of the art in the field of protein structure prediction. Sequences whose structures have been solved but not yet published are sent out to participating teams, and we have 3 days to send back predictions. The whole thing is conducted in a double-blind fashion, ensuring fair assessment and truly blind prediction.
Robetta
Structure prediction for the community, by the community. Robetta is a server for protein structure prediction that makes Rosetta's structure prediction capabilities available to the scientific community (and to the public). The computation for this will be conducted on BOINC, meaning that you guys will be crunching protein structure prediction jobs for real scientific studies conducted by researchers all over the world.
Improvements since CASP 9
Over the last two years we have extensively modified our structure prediction methodology. Preliminary results indicate that we've made more improvement in the last two years than in the previous 6 years combined. For the first time there is significant doubt whether humans can improve upon the results from computers. So this could be a very exciting CASP.
Thanks again everyone for crunching; we wouldn't be able to do this stuff without you!
Excitedly yours,
Chris, Ray, Frank, Yifan, David Baker, David Kim, Hetu and TJ
Translation (originally German) - susanne@seti.germany
Greetings!
CASP 10, a community-wide experiment in structure prediction, begins tomorrow, May 1st, and runs until August 1st. During this time we will be putting a heavy load on BOINC for structure predictions. If your work unit begins with the label rb, you are processing a CASP 10 target! rb is short for Robetta, our public server, which is available for structure predictions.
CASP
CASP is an international experiment to assess the state of the art in the field of protein structure prediction. Sequences whose structures have been solved but not yet published are sent to participating teams, and we then have 3 days to send back the predictions. The whole thing is handled with the double-blind method to ensure a fair assessment.
Robetta
Structure prediction for the community, by the community. Robetta is a server for structure predictions that shares Rosetta's structure prediction capabilities with the scientific community (and with the public). The computations for this are run on BOINC, which means that you crunch protein structure prediction jobs for real scientific studies carried out by researchers all over the world.
Improvements since CASP 9
Over the last two years we have extensively modified our structure prediction methodology. Initial results show that we have made more improvements in the last two years than in the previous 6 years combined. For the first time there is doubt whether humans can improve on the computer results. So CASP could be very exciting this time.
Thanks again to everyone for crunching; we could not do these things without you!
Excitedly yours,
Chris, Ray, Frank, Yifan, David Baker, David Kim, Hetu and TJ
----------------------------------------------------------------------------------------------------------------------------------
Sid Celery - Message 72949 - Posted 1 May 2012 4:52:43 UTC - in response to Message ID 72941.
CASP
CASP is an international experiment to assess the state of the art in the field of protein structure prediction. Sequences whose structures have been solved but not yet published are sent out to participating teams, and we have 3 days to send back predictions. The whole thing is conducted in a double-blind fashion, ensuring fair assessment and truly blind prediction.
You state you have 3 days to send back predictions. Can I ask a very specific question that I've raised before:
The default work buffer set is 0.25 days with a 3 hour runtime, but some of us maintain a larger work buffer in order to avoid task outages. I personally use 2.0 days, but others may use a larger amount.
The default settings allow tasks to be returned to you in good time, but is it true to say that if the work buffer+runtime totals more than 3 days, then the work we grab will not be returned to you in sufficient time for the results to count?
I will assume this is the case, so I'm reducing my work-buffer to 1.5 days - plus my 8-hour runtime - to allow a certain leeway for you to receive work back in time. Please confirm so that others can make similar adjustments.
Obviously, with reduced work buffers, there's an equivalent requirement for tasks to be reliably available at your end, so an extra degree of monitoring would be wise.
On the assumption that my guesses are correct, you may see a reduced rate of task downloads while our buffers are run down, though tasks will be returned somewhat sooner after release. As long as tasks are readily available, there should be no reduction in the results you see back.
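The arithmetic behind this question can be sketched in a few lines of Python. This is only a rough illustration: it assumes the worst case, in which a freshly downloaded task waits behind a full work buffer before it starts running, and the function names and the `limit_days` parameter are made up for the example, not anything BOINC or Rosetta@home provides.

```python
HOURS_PER_DAY = 24.0


def worst_case_turnaround_days(buffer_days, runtime_hours):
    """Rough upper bound (assumption): the task waits out the whole
    work buffer and then runs for its full target runtime."""
    return buffer_days + runtime_hours / HOURS_PER_DAY


def returns_in_time(buffer_days, runtime_hours, limit_days=3.0):
    """True if the worst-case turnaround fits inside the limit."""
    return worst_case_turnaround_days(buffer_days, runtime_hours) <= limit_days


# Default settings mentioned in the post: 0.25-day buffer, 3-hour runtime.
default_days = worst_case_turnaround_days(0.25, 3)

# Reduced settings proposed in the post: 1.5-day buffer, 8-hour runtime.
reduced_days = worst_case_turnaround_days(1.5, 8)

print(f"default: {default_days:.3f} days, reduced: {reduced_days:.2f} days")
```

With these numbers the default settings come to 0.375 days and the reduced settings to roughly 1.83 days, both under the 3-day window quoted above.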
----------------------------------------------------------------------------------------------------------------------------------
P . P . L . - Message 72954 - Posted 1 May 2012 5:31:17 UTC
CASP 10, a community-wide experiment in structure prediction, starts tomorrow on May 1st and runs to August 1st. During this time we will be using BOINC heavily for structure prediction. If your work unit starts with the label rb, you're running a CASP 10 target! rb is short for Robetta, which is our publicly available server for structure prediction.
The only problem is I've been seeing these task names for weeks now. Is there going to be some other way to tell which are really CASP tasks?
Something added to the task naming, maybe.
----------------------------------------------------------------------------------------------------------------------------------
David Baker - Message 72956 - Posted 1 May 2012 5:52:06 UTC - in response to Message ID 72949.
Yes, you are absolutely right. For CASP we need results back within a day or two, as our approach is iterative: we analyze the results after one day and send out another set of WUs based on these results for two days of computing, then collect the results and submit to CASP. So please do set your buffer to a shorter time, and let us know if you are running out of WUs. Thanks!
----------------------------------------------------------------------------------------------------------------------------------
Rocco Moretti - Message 72962 - Posted 1 May 2012 16:23:03 UTC - in response to Message ID 72954.
The only problem is I've been seeing these task names for weeks now. Is there going to be some other way to tell which are really CASP tasks?
A large number of those workunits have been pre-CASP testing - that is, running the entries from previous CASPs through the CASP10 structure prediction machinery and checking that everything is working properly. Now that CASP has started, that testing is pretty much over (although there might be occasional tests to double check something, or to try a last-minute fix).
A small portion of those workunits were for structure prediction jobs which were submitted to Robetta by other research groups. But to conserve resources, that public submission is going to be disabled for the duration of CASP.
So if you see an rb task in the next few months, in all likelihood it should be for CASP.
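For anyone who wants to pick these tasks out of a list of names, a minimal sketch of the prefix check might look like the following. It assumes only what the thread states, namely that CASP work units start with the label rb. The first task name in the list is a made-up placeholder, and the check is case-sensitive by assumption.

```python
def is_probably_casp_task(task_name):
    """Heuristic from the thread: during CASP 10, names starting with
    'rb' are in all likelihood CASP targets (not a guarantee)."""
    return task_name.startswith("rb")


tasks = [
    "rb_hypothetical_example_task",   # placeholder name, not a real task
    "ab_centroidAbrelax_cst_3qc7A",   # non-rb name mentioned in the thread
]

casp_like = [name for name in tasks if is_probably_casp_task(name)]
print(casp_like)
```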
----------------------------------------------------------------------------------------------------------------------------------
Sean Kiely - Message 72964 - Posted 1 May 2012 17:19:26 UTC - in response to Message ID 72956.
I would recommend that you post an item under "News" on the homepage (and also a new thread in the number-crunching forum) asking participants to check their work buffer settings and reduce them to no higher than 1.5 days. This might reduce the number of CASP units that are processed but not returned quickly enough to be useful.
----------------------------------------------------------------------------------------------------------------------------------
Sid Celery - Message 72966 - Posted 1 May 2012 18:11:49 UTC - in response to Message ID 72956.
Thanks for the quick reply. I didn't anticipate that you did post-processing of results - is ~1.83 days (1.5 days + 8-hour runtime) sufficient for you? What would your ideal maximum turnaround time be?
Another issue that arose last year was the fact that the BOINC manager doesn't help us adhere to a quicker-than-usual turnaround time because of issues like "debt" between projects (I'm not qualified to talk about this tbh but I know there's a factor involved). Personally I'll be setting WCG to "No New Tasks" for the duration as Rosetta is my primary project.
The biggest issue last year, though, was the "Deadline" we see in the BOINC Manager being set at 10 days from download - especially when a contributor runs more than one project (because of the issue in my second paragraph). Is there any way you can set the deadline for specific CASP10 tasks to your preference - your ideal maximum turnaround time? That way, the BOINC Manager will ensure that targets are met rather than (effectively) working toward missing them.
I notice this afternoon I've received a non "RB" task from Rosetta (ab_centroidAbrelax_cst_3qc7A) after the first rb tasks have come down. In order to distinguish between urgent and non-urgent tasks, CASP10 tasks should have (say) 2-day deadlines & all others the usual 10-day deadline. Can this be done from your end?
I can't think of any other issues that might prevent the CASP exercise from operating successfully.
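To make the deadline proposal above concrete, here is a small sketch of the rule being asked for: 2 days for rb tasks and the usual 10 days for everything else. It only illustrates the suggestion and says nothing about how the project server actually assigns deadlines. The function name, the rb task name and the download time are hypothetical.

```python
from datetime import datetime, timedelta

CASP_DEADLINE = timedelta(days=2)      # proposed in the post, "(say) 2-day"
DEFAULT_DEADLINE = timedelta(days=10)  # the usual deadline mentioned above


def proposed_deadline(task_name, downloaded_at):
    """Deadline under the proposed rule: shorter for rb (CASP) tasks."""
    is_casp = task_name.startswith("rb")
    return downloaded_at + (CASP_DEADLINE if is_casp else DEFAULT_DEADLINE)


downloaded = datetime(2012, 5, 1, 18, 0)  # illustrative download time
print(proposed_deadline("rb_hypothetical_task", downloaded))
print(proposed_deadline("ab_centroidAbrelax_cst_3qc7A", downloaded))
```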
----------------------------------------------------------------------------------------------------------------------------------
Aegis Maelstrom - Message 72970 - Posted 2 May 2012 16:35:15 UTC - in response to Message ID 72966.
Last modified: 2 May 2012 16:36:12 UTC
Hi, I second Sid Celery in his proposals.
My guess is that a majority of R@h crunchers are not meeting your requirement of a 1-2 day turnaround per task. To be effective, they need to be forced (or at least informed) to change their behaviour. As CASP10 has already started, we would need to inform them right away.
Furthermore, let's be honest - a vast majority of crunchers do not read information from the projects or their teams on a daily basis. They even lag severely in e-mail communication. Moreover, even if they learn about the new requirements, a fair number of participants will forget or be unable to adjust their crunching pattern.
In this situation, changing the deadlines for WUs, in addition to publicising the issue, seems to be the best option. This is especially important for computers without permanent Internet access, those used for only a short time per day, those set to longer run times, and so on.
That, or sending CASP10 WUs strictly based on behavioural patterns (only to "fast" crunchers, if their computing power is big enough).
Best Regards and Happy Crunching.
----------------------------------------------------------------------------------------------------------------------------------
Sid Celery - Message 72978 - Posted 3 May 2012 2:32:58 UTC - in response to Message ID 72970.
I would guess this isn't true actually. Most people will work with the defaults of 0.25 days buffer & 3-hour run-times, so in the main everything ought to be fine.
The problem will be that inveterate fiddlers (presumably like you and me) will have tweaked their settings. Hopefully they cast their eye over the forums too and will catch this wrinkle before long. At the same time, these same people will possibly be the ones who dedicate more resources to Rosetta, so it may make a disproportionate amount of difference. Just speculating, obviously.
As long as we tweak things appropriately, everyone should get what they want.
It's also worth a shout to say when CASP10 is over so we can revert to our individual preferences afterwards.