BIP-03: Culling The Fragment Pool

TheMuffinMan

Summary
I’m proposing that we undertake a massive culling of legacy Botto fragments that have not performed well during their time in the Fragment Pool. The Fragment Pool should have a heavy focus on newer pieces since they are created from the most advanced/highly trained version of Botto.

Background
Reducing the fragment pool has been discussed numerous times but I’m not aware of anything being implemented. My current understanding is that Botto adds 350 fragments to the pool each week. Being in week 19 of the Botto process means that there are 6,632 fragments in the current pool ((350*19) - 18 = 6,632) and this number will continue to grow by 349 each week.
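
As a quick check of the arithmetic above, here is a minimal sketch. It assumes the subtracted 18 represents one minted fragment leaving the pool per completed round, which the post does not state explicitly; `pool_size` is a hypothetical helper, not part of Botto.

```python
FRAGMENTS_PER_WEEK = 350  # weekly additions quoted in the post


def pool_size(week: int) -> int:
    """Fragments in the pool during `week`.

    Assumption: one fragment is minted (and leaves the pool) per completed round.
    """
    minted = week - 1
    return FRAGMENTS_PER_WEEK * week - minted


print(pool_size(19))                  # 6632
print(pool_size(20) - pool_size(19))  # 349
```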

Rationale
The current fragment pool is unmanageable for even the most active voters. With over 6,600 fragments currently in play, an active voter could easily go through consecutive voting cycles without seeing a single piece that ends up on the leaderboard. What this unfortunately means is that our leaderboard isn’t actually a selection of the most desirable pieces but rather a representation of which pieces were lucky enough to be shown more frequently to Bottonians that spend more votes during the first portion of the voting cycles. Drastically reducing the number of fragments in the pool will increase the visibility of new pieces, create the opportunity for a better-vetted leaderboard, and train Botto more effectively.

Proposal Specifications
Based on feedback from the community and the team, I’m updating the implementation specifications of my proposal to be broken into 2 categories (and assuming a Round 22 roll-out):

The Initial Cull:
a. Remove all fragments from the pool that are more than 3 weeks old and have a True Score in the bottom 95%.
i. New Fragment Pool Size = (350 x 3) + .05(350 x 19) -21 = 1,361 Fragments
Ongoing Cull:
a. At the end of each voting period, remove the 350 fragments with the lowest True Score that have been around for more than 3 weeks.
i. To better enable this strategy this will also entail reducing the required views to determine True Score from 100 to 50

Fragment Pool Composition After Roll-Out:

26% brand new fragments
51% pieces under time protection
23% legacy Botto fragments
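
The pool size and composition figures can be reproduced with a short sketch. The variable names are illustrative, and reading the subtracted 21 as already-minted legacy fragments is an assumption rather than something the proposal spells out.

```python
PER_WEEK = 350
PROTECTED_WEEKS = 3   # newest rounds, exempt from the cull
OLDER_WEEKS = 19      # rounds outside the protection window at a Round 22 roll-out
KEEP_PERCENT = 5      # top 5% of older fragments by True Score are retained
MINTED = 21           # subtracted in the post; assumed to be already-minted winners

brand_new = PER_WEEK
time_protected = PER_WEEK * (PROTECTED_WEEKS - 1)
legacy = PER_WEEK * OLDER_WEEKS * KEEP_PERCENT / 100 - MINTED
total = brand_new + time_protected + legacy

counts = {"brand new": brand_new, "time protected": time_protected, "legacy": legacy}
shares = {name: round(100 * count / total) for name, count in counts.items()}

print(int(total))  # 1361
print(shares)      # {'brand new': 26, 'time protected': 51, 'legacy': 23}
```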

Here are some additional notes/clarifications on why the proposal has been updated this way:

The time protection for pieces in their first 3 weeks has remained due to Quasimondo stressing the importance for some element of time protection.
I’ve elected to use the True Score over the leaderboard for retaining high potential fragments in the pool. A common trend in the feedback was that retaining fragments based on the leaderboard wasn’t the desired approach.
This change would reduce the current voting pool in Round 22 from 7,700 to 1,361 (approximately an 80% reduction) and the size of the fragment pool will stay the same week over week.
The team was concerned with how small the fragment pool would be in my initial proposal; this new proposal will keep an additional 100 fragments in the pool.

Important Implementation Note: Fragments entering circulation for the first time are considered as Week 1.
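
To make the ongoing cull rule concrete, here is a rough sketch of how it could be expressed. The `Fragment` fields and the `ongoing_cull` function are hypothetical and do not reflect Botto's actual data model; the only rules taken from the proposal are the 3-week protection, the Week 1 convention, and removing the lowest True Scores.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    fragment_id: int
    entered_round: int  # round in which the fragment first entered circulation
    true_score: float


def weeks_in_pool(fragment: Fragment, current_round: int) -> int:
    # Per the implementation note: the first round in circulation counts as Week 1.
    return current_round - fragment.entered_round + 1


def ongoing_cull(pool, current_round, protected_weeks=3, cull_count=350):
    """Drop the `cull_count` lowest True Scores among fragments past protection."""
    eligible = [f for f in pool if weeks_in_pool(f, current_round) > protected_weeks]
    eligible.sort(key=lambda f: f.true_score)
    removed_ids = {f.fragment_id for f in eligible[:cull_count]}
    return [f for f in pool if f.fragment_id not in removed_ids]


# Tiny example: round 6, cull the 2 weakest unprotected fragments.
pool = [
    Fragment(1, 1, 40.0), Fragment(2, 1, 90.0), Fragment(3, 2, 10.0),
    Fragment(4, 5, 5.0), Fragment(5, 3, 60.0), Fragment(6, 6, 1.0),
]
kept = ongoing_cull(pool, current_round=6, cull_count=2)
print([f.fragment_id for f in kept])  # [2, 4, 5, 6]
```

Note that fragments 4 and 6 survive despite having the lowest scores, because they are still inside the protection window.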

Advantages
• Voters will have a more direct impact on the leaderboard each week and will place more value on voting prior to the leaderboard selection
• The quality/desirability of the leaderboard will be increased
• The Fragment Pool will be more manageable for voters
• Voters won’t have to vote on the same undesirable pieces week after week because they never go away

Disadvantages
• Sticker shock of recommending an 80% reduction in the size of the Fragment Pool

Ben

Crossposting my reply from Discord

I like this a lot. Right now, a lot of gems from past leaderboards get buried in voting under all the new fragments and never again see the light of day.

So I think this would greatly improve the signal-to-noise in voting, improve the avg quality of the winning fragments, and improve the UX for voters by making it less overwhelming and only displaying the most promising pieces.

I even wonder if it could be further filtered by either releasing under 350 fragments per week, say something more like 150, or additionally culling the 50% worst performing fragments by votes from the prior two weeks.

TheMuffinMan

Made a clarification in the Proposal Specifications section:
• Fragments after week 3 that qualify as "on the leaderboard" are the 15 fragments that comprise the starting leaderboard each week.

choobie

Hey, great timing on this proposal. Let me preface this post by stating that I generally agree with this change. Deconstructing your post:

The Fragment Pool should have a heavy focus on newer pieces since they are created from the most advanced/highly trained version of Botto.

This is true, and definitely occurs earlier on in the week during the voting process.

The current fragment pool is unmanageable for even the most active voters. With over 6,600 fragments currently in play, an active voter could easily go through consecutive voting cycles without seeing a single piece that ends up on the leaderboard.

I agree that the current fragment pool is largely unmanageable for the most active voters, but this is not only because of the sheer quantity of fragments in the voting pool at any given stage. It also has to do with the quantity of active voters voting earlier on in the week - It takes longer for fragments to be attributed a true score (this occurs when the fragment receives 100 unique views), meaning that new fragments remain in a voter's consideration set for a longer period throughout the week. This is because they automatically retain a score of 100 prior to receiving their true score.

What does this imply, then?

Voters see high-scoring fragments (the latest from the week) for too long a period
Leaderboard presence is effectively skewed at the end of the week from this (lack of true) scoring behavior
Culling the fragment pool will have a positive effect, but not before scores normalize (after 100 unique views)
We should allow for fragments to attain their true score before 100 unique views
We should look at mechanisms that incentivize voting earlier in the week, or more generally, incentivize voting more as a whole (I am greatly in favor of this). Culling the fragment pool by 80% will not have direct impact on voting behavior (the trend here is that generally more voters pop in towards the end of the round - this is unlikely to change from a cull).

Drastically reducing the number of fragments in the pool will increase the visibility of new pieces, create the opportunity for a better-vetted leaderboard, and train Botto more effectively.

New pieces are already 'drastically visible'. The issue seems to be that we take too long to sort through them, and they're beginning to pile up: too many high scoring ones retain their high score throughout the week and are shown 80% of the time, while too many 'low scoring' ones occupy the other 20% of display time (i.e. they're being given redemption arcs).

We should commit to changes in two phases, where:

Phase 1:

75% reduction in fragments sorted by VP (so, lowest VP fragments get removed)
Occurs on round 22
We'd end up with ~1800 fragments in the pool less winners
Excluding the current round
350 added each week
350 removed each week

75% is the recommended number as this is a number we have already set up internally; it's just a case of setting up new buckets to preserve the discards. For the timeframe, we recommend Round 22 as this is most feasible for the team.

Phase 2

Reduce the time it takes for a true score to be attributed to an individual fragment, from 100 unique views to 50 unique views.

I recommend this change occurs on Round 23 if we see no improvement in the voting pool. This change mitigates the duration of impact that the new fragments have in the voting pool, meaning faster attribution of score. The bias that this might cause is that older fragments with higher scores are more frequently shown alongside very prominent/(true) high-scoring fragments from the latest round (which seems to be a desirable 'bias').

If these two phases do not mitigate the frequency of new round fragments in the voting pool, I believe we might need to re-evaluate the scoring mechanism. I am quite confident that these two changes will have the desired effect.

TheMuffinMan

choobie

First off - thank you for the response! I enjoyed reading your critiques on my proposal and how we can effectively address the fragment pool size problem.

I want to dive straight in to where it seems like our opinions drastically diverge- "but this is not only because of the sheer quantity of fragments in the voting pool at any given stage. It also has to do with the quantity of active voters voting earlier on in the week."

I strongly believe that active voters are scarce earlier in the week because voting during that period feels pointless. There are so many fragments, you don't see most of them, and the leaderboard is likely completely different than the pieces you've seen. It's an unmotivating sentiment for voters. Whereas later in the week, you know that your votes on the 15 pieces matter; you have a direct impact. People are not AI; they care about emotions, feelings, and impact whereas Botto does not.

You and I seem to be speculating opposite sides on user behavior here, neither of us will know who is correct unless we make a change and evaluate; but I believe that this change will provide some motivation to voters to be more active earlier in the week if it is effectively communicated to them.

I'm not sure that I fully understand your true score/100 unique views, however it seems like with a smaller fragment pool and motivation to vote earlier in the week, the true score will be determined faster.

Lastly, I still prefer my method of fragment pool reduction to the specifics you outlined in Phase 1. I think the fragment pool should be much smaller and sorting it by lowest VP will be unfavorable to newer pieces since they have had less time in circulation.

hudsonsims

TheMuffinMan you can find the explanation of the scoring here: https://docs.botto.com/details/voting-mechanism

tl;dr is that pieces are given a score out of 100 based on voting, 80% of the time higher scores are shown and 20% lower scores are. new fragments are given a default score of 100 until they are viewed 100 times to ensure they are seen and have a fair sample from which to be given a fair score.
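
A rough sketch of the behaviour described in this tl;dr follows. Only the default score of 100, the 100-view threshold, and the 80/20 split come from the thread; treating "higher scores" as the top half of the pool is purely an assumption for illustration, and the function names are hypothetical rather than Botto's real implementation.

```python
import random

DEFAULT_SCORE = 100   # score a fragment carries until it has enough views
VIEW_THRESHOLD = 100  # unique views needed before the true score applies


def effective_score(true_score, unique_views, threshold=VIEW_THRESHOLD):
    """Score used for display priority."""
    return DEFAULT_SCORE if unique_views < threshold else true_score


def pick_fragment(fragments, rng, high_share=0.8):
    """fragments: list of (fragment_id, true_score, unique_views) tuples."""
    ranked = sorted(fragments, key=lambda f: effective_score(f[1], f[2]), reverse=True)
    cut = max(1, len(ranked) // 2)  # assumption: "higher scores" = top half
    high, low = ranked[:cut], ranked[cut:] or ranked[:cut]
    bucket = high if rng.random() < high_share else low
    return rng.choice(bucket)


fragments = [
    ("new", 20, 10),        # weak piece, but still unscored
    ("old_good", 85, 500),
    ("old_mid", 50, 500),
    ("old_bad", 15, 500),
]
rng = random.Random(0)
draws = [pick_fragment(fragments, rng)[0] for _ in range(10_000)]
print(draws.count("new") / len(draws))
```

In this toy pool the unscored fragment shares the high-priority bucket with the best legacy piece until it reaches the view threshold, which is the effect the discussion keeps returning to.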

What this unfortunately means is that our leaderboard isn’t actually a selection of the most desirable pieces but rather a representation of which pieces were lucky enough to be shown more frequently to Bottonians that spend more votes during the first portion of the voting cycles.

This ^ isn't quite right. Based on the scoring mechanism, new pieces are actually guaranteed to get viewed 100 times at the beginning of the round. But, it is true the early voters have significant influence on that critical early momentum, which actually conflicts with your reasoning that people don't have enough incentive to vote early in the round.

Culling formula
I disagree with removing fragments that haven't made it to the leaderboard as this goes against your own reasoning of the leaderboards not currently being representative.

@choobie's proposal to remove by VP I think makes more sense to me, but does need some mitigation so as to normalize across rounds.

Instead can retroactively remove bottom 75% based on votes cast just in the last round.
We could adapt this if the feeling is that the voting is already too skewed and we'd lose some favorable pieces (though this leaves 1600+ fragments, so feel like there's plenty). For instance, apply the 75% cut to round 10, then from there only remove the bottom 75% of fragments created from that round up to the present round.

Then going forward can remove bottom 350 of each individual round (equates to bottom 20%)

Scoring formula
As for adapting the scoring to determine what gets shown, I think it's too early to say. The views are too divergent here: one rationale is that there's a lot of junk in the new set that gets in the way of promising pieces from previous rounds, the other is that newer is presumably better. I tend to agree with there being a lot of random stuff in each new set (along with promising stuff) and it wouldn't be bad for them to be sorted more quickly so that other pieces have a chance at catching early momentum in the round. But that needs more discussion and doesn't need to be included in this proposal.

Generally agree that more voting is going to help everything, and could help draw attention to the influence of voting earlier in the round.

anonymeth

Agree with this! Thanks for the write up.

TheMuffinMan

hudsonsims

Thanks for the response! I liked some of your points and appreciated the concise explanation of the voting-mechanism.

It seems like altering the true score is a different can of worms and would be best suited for another proposal, although it is very interesting.

I didn't follow your comment "I disagree with removing fragments that haven't made it to the leaderboard as this goes against your own reasoning of the leaderboards not currently being representative." I don't think I made that comment in the way you may be interpreting it.

Tbh, I'm not sure your counter with the 75% removal combined with the bottom 20% offers any noticeable benefit over the initial proposal, it seems like a matter of personal preference.

mantis

I generally agree with the spirit of this proposal. I think perhaps we need a better way of categorizing styles and subjects that are not popular and cull those we don't like multiple at a time -- Botto should stop getting any positive feedback on those.

TheMuffinMan

mantis

Thanks! I would speculate it would be a much longer process to vote on what to do with the styles/subjects and split things that way. It would make for an interesting proposal!

However, in this case I think rolling something out before the fragment pool gets much larger is important - so speed matters.

juanje

My main take after reading all this is that an average Bottonian (like me) is not well-positioned to cast a vote on this proposal. Meaning: I mostly agree with the goal and the rationale but don't know which specifications are the best implementation. So in a wonderful world, I would delegate my votes here to the team, who truly know the inner workings of the voting mechanisms. Also in a wonderful world, if you wish, TheMuffinMan, you might also engage more closely with them to see what can/should be done. The phased proposal mentioned by Choobie seems realistic and apparently has gathered some support within the team, so I would go for that. Also, I don't know whether Quasimondo should have a closer look at this....

quimp

I agree with juanje, with an added note that I generally prefer to keep systems as simple as possible and readjust as needed. This is likely only the first of many proposals on culling.

Lastly, I would want to make sure that we tap into Quasimondo's expertise and vision before moving to a vote.

choobie

I want to dive straight in to where it seems like our opinions drastically diverge- "but this is not only because of the sheer quantity of fragments in the voting pool at any given stage. It also has to do with the quantity of active voters voting earlier on in the week."

Firstly, regarding the quantity of active voters - this is not my opinion, this is based on data from the admin panel. We take too long to sort through the fragments early on in the week because they take too long to score. They take too long to score (meaning they retain a temporary score of 100) because they never hit 100 unique views until it's leaderboard time. This has happened frequently over the course of the past few weeks. And this is why the phase 2 the team proposed is crucial in the short-term while we make voting more attractive earlier in the week, which is something all of us want to address! For one, I would suggest we change the speed in which a fragment gets its true 'Score'. This should either act as a precursor to any culling and/or be embedded into your proposal.

The Fragment Pool should have a heavy focus on newer pieces since they are created from the most advanced/highly trained version of Botto.

Secondly, I actually think this is the key divergence point between what the team wants to do and your proposal. The fragment pool already has a heavy focus on newer pieces - because of the scoring system and the 80/20 priority (80% of the time, top scoring fragments appear). This 'priority' in newer pieces has already been accommodated for through the implementation of scoring new pieces at 100 until they reach 100 views.

Based on your proposal, by preserving 3 weeks worth of the latest 350 fragments in the voting pool:

There will still be 350 new pieces per week that appear in the voting pool, just like before. This has no effect on focusing on the newer pieces. The voting pool will still force users to first comb through the latest 350 pieces to assign them a score, and we will still see a lot of these pieces appear in the leaderboard because they don't attain their true score in time (due to little voting activity).
Assuming the community successfully does comb through these 350 pieces and attribute them a score (say, because of an uptick of voters earlier in the week), then 80% of the time, top performing fragments from previous rounds (the 240+15 that make it to the leaderboard each week) will be paired against one another over and over again, with the other 20% of the time lower scoring ones attempting to make a resurgence. This is not speculating user behavior - this is what will happen based on Botto's current system.

This means we'll likely vote for fragments that might have just narrowly missed out on a mint (this is speculation/an educated guess of what will happen). But it also means that we'll likely be voting on the same or similar pairs over and over again at the tail end of the week, in perpetuity or until the period ends (assuming we make it past the 100/100 score of the latest round fragments).

I'm not sure that I fully understand your true score/100 unique views, however it seems like with a smaller fragment pool and motivation to vote earlier in the week, the true score will be determined faster.

Hopefully this is clarified above - the tl;dr here is that no, likely none of your proposal in its current form will directly affect how fast we achieve that true score value. We might see an uptick in how many users vote just to 'try out' the new implementation, but I surmise that this will fall off quite quickly after the first week or two.

This proposal disregards the majority of data we have available about each fragment across all weeks in favour of offering 'protection' to the latest 1050 fragments. The data I am referring to here are scores of each fragment as well as VP expenditure on each, which I strongly believe should be taken into consideration for any removal of fragments from the voting pool. It begs the question: Why did we, as a community, vote for 20+ weeks if we don't use our votes to directly cull the tail-end of the fragment pool? And while yes, leaderboard data is based on round votes, it's been heavily skewed to newer fragments because of the [lack of] score attribution. We have the ability to cull based on existing voting data, i.e. based on Score or all time VP expenditure on a specific fragment.
This proposal does not take into consideration that a leaderboard did not exist in the first few weeks of the project.

A (small!) drawback here is also developer time, this is not as straightforward relative to other options at our disposal.

I suggest OP updates their proposal with the following considerations in mind:

There is duality to this proposal - instead of discussing removing historically poor performing fragments, we are talking about introducing a protectionist mechanism for the 3 latest rounds worth of fragments.
Existing data is available on every fragment, with these datapoints signalling historical voter preference
The scoring of latest round fragments is crucial to understanding how this proposed system plays out, and whether any proposed system plays out at all

chrisrauh

I also support removing leaderboard considerations from the culling process. Seems like that is not necessary to achieve the goals and it makes the implementation more straightforward.

TheMuffinMan

Cross-posting from Discord:

I’ve taken additional time to review @choobie's suggested changes to my proposal, and I still have concerns regarding this recommendation that I will outline here (I will use Round 19 for consistency with my proposal rather than Choobie’s Round 22):

Choobie’s recommendation highlights:

75% cull by total VP
Going forward each week 350 new fragments in, 350 old fragments out

What would this look like?

1925 Fragments in the pool (350 + ((350*18)*.25))
a. I believe this is still too large to be manageable
The other piece is 350 fragments in -> 350 fragments out. But what are the 350 fragments that are leaving?
a. The 350 lowest total VP fragments are likely to be the ones that were just added the previous week since they’ve been in circulation for the shortest duration. This will mean that new fragments come in week 1, they get culled before week 2, and the fragment pool largely stays the same stale pieces that they were before + the fragments for the new week.
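
The pool size above, and the concern about culling by total VP, can be illustrated with a toy sketch. The toy pool assumes every fragment collects the same VP per week, which is a deliberate simplification to show the age effect in isolation; it is not based on real voting data.

```python
PER_WEEK = 350

# Pool after a 75% cull of 18 older rounds plus one new round, as in the post.
pool_after_cull = PER_WEEK + (PER_WEEK * 18) * 0.25
print(pool_after_cull)  # 1925.0

# Toy pool: 5 rounds, 3 fragments per round, 10 VP collected per week in the pool
# (hypothetical numbers chosen only to isolate the effect of age on total VP).
CURRENT_WEEK = 5
fragments = [
    {"week_added": week, "total_vp": 10 * (CURRENT_WEEK - week + 1)}
    for week in range(1, CURRENT_WEEK + 1)
    for _ in range(3)
]

# Cull the 3 fragments with the lowest all-time VP.
culled = sorted(fragments, key=lambda f: f["total_vp"])[:3]
culled_weeks = {f["week_added"] for f in culled}
print(culled_weeks)  # {5}
```

Under that simplification the newest round is always the one removed, which is the scenario the post is worried about; real VP distributions would of course be less uniform.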

Another point of the discussion was using True Score rather than Total VP. This does seem to be a better option than the previous one, assuming we can quickly and consistently determine the True Score on the new fragments each week. Choobie recommended reducing the required unique views from 100 to 50, which does seem to be a positive change.

The potential negative is that if this is a critical piece of determining culling, it should rely on more than 50 people; therefore the total fragment pool needs to be very small.
Based on my thoughts above, if we were to revise my initial proposal, I feel that it would have to be:

Cull 90% of previous fragments based on lowest True Score
a. A delivery ETA of Round 22 would mean: Fragment pool = (350 + .1(350*21) - 21) = 1,064 Fragments
On an ongoing basis, each week cull the lowest 350 Fragments by True Score each week prior to adding in the new 350 fragments
True Score required views is reduced from 100 to 50

As a result, the Fragment Pool size would stay static at 1,064 Fragments for the remainder of the Genesis Period.
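
A minimal sketch of the 1,064 figure and the steady-state behaviour follows. `pool_after_cull` is a hypothetical helper, the subtracted 21 is assumed to be one minted fragment per completed round, and the weekly loop ignores the single fragment minted each round.

```python
PER_WEEK = 350


def pool_after_cull(keep_percent: int, rollout_round: int = 22) -> int:
    """New round + retained share of older rounds - minted fragments (assumed)."""
    older = PER_WEEK * (rollout_round - 1)
    minted = rollout_round - 1
    return PER_WEEK + older * keep_percent // 100 - minted


size = pool_after_cull(keep_percent=10)
print(size)  # 1064

# Ongoing rule: cull the lowest 350 by True Score, then add 350 new fragments.
sizes = [size]
for _ in range(4):
    sizes.append(sizes[-1] - PER_WEEK + PER_WEEK)
print(sizes)  # [1064, 1064, 1064, 1064, 1064]
```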

Please share your thoughts on this route relative to my initial proposal to determine which we should go with.

hudsonsims

TheMuffinMan
This conclusion looks pretty good to me.

Personally would prefer 80% rather than 90% -- from my look at the scoring data, below 80% is where we start to see crappy pieces, which makes sense with the Pareto 80/20 rule. 75% is where we guarantee not losing any promising pieces. I don't think the difference of 500-900 fragments in the pool is going to mess with the goal, whereas changing the scoring function as choobie suggested will make the difference. We can always cull more later.

I'll defer to @choobie on whether VP or total score is more accurate of a metric for the culling goal since he knows much better what they represent.

Agree with removing round's lowest 350 going forward (think this was what was actually suggested by choobie originally, so don't see any other disagreement arising there) and changing the scoring threshold from 100 to 50.

TheMuffinMan

Based on Feedback, I've updated the Proposal Specifications section of this proposal:

Proposal Specifications
Based on feedback from the community and the team, I’m updating the implementation specifications of my proposal to be broken into 2 categories (and assuming a Round 22 roll-out):

The Initial Cull:
a. Remove all fragments from the pool that are more than 3 weeks old and have a True Score in the bottom 95%.
i. Fragment Pool = (350 x 3) + .05(350 x 19) -21 = 1,361 Fragments

Ongoing Cull:
a. At the end of each voting period, remove the 350 fragments with the lowest True Score that have been around for more than 3 weeks.
i. To better enable this strategy this will also entail reducing the required views to determine True Score from 100 to 50

Here are some additional notes/clarifications on why the proposal has been updated this way:

The time protection for pieces in their first 3 weeks has remained due to Quasimondo stressing the importance for some element of time protection.
I’ve elected to use the True Score over the leaderboard for retaining high potential fragments in the pool. A common trend in the feedback was that retaining fragments based on the leaderboard wasn’t the desired approach.
This change would reduce the current voting pool in Round 22 from 7,700 to 1,361 (approximately an 80% reduction) and the size of the fragment pool will stay the same week over week.
The team was concerned with how small the fragment pool would be in my initial proposal; this new proposal will keep an additional 100 fragments in the pool.
Important Implementation Note: Fragments entering circulation for the first time are considered as Week 1.

choobie

Based on your current proposal, we could put it to a vote as I do see this proposal is committed to time protection. In case you're willing to reconsider elements of your proposal, I'm going to mention a few things on the latest implementation specifications:

The Initial Cull:
a. Remove all fragments from the pool that are more than 3 weeks old and have a True Score in the bottom 95%.

If we take score as the method for culling, preserving 3 of the latest rounds of fragments doesn't make sense - they've achieved their true score at the end of each week. All this does is bias culling to preserve recent rounds worth of fragments in the to-be fragment pool, rather than actually using the scoring system as intended.

Ongoing Cull:
a. At the end of each voting period, remove the 350 fragments with the lowest True Score that have been around for more than 3 weeks.

If we have a fragment pool of +-1350 fragments, we're in actuality committing to culling the oldest fragments. This is because we'd be 'time protecting' the latest 3 rounds worth of fragments (1050 pieces, or 3x350 - you state 'more than 3 weeks old'). Have I misunderstood something here? Clarification would be highly appreciated so I don't misunderstand what's being proposed. Irrespective of this, culling (be it initial/ongoing) by true score makes sense and is consistent with current system implementation.

In terms of culling 95% of fragments, we'd be looking at an even smaller pool than initially stipulated in your proposal - I believe your original proposal stated 80%. The current pool size (including the round that's just started) is 6808. If we deem the latest round (Round 21, 03 March 2021) of fragments untouchable, we'd be looking at (0.2*6458)+350 ~= 1642 fragments. Ultimately I'm not sure whether this difference is negligible but I'd err on the side of caution with culling too much! While I believe the 'sweet spot' for pool size is ultimately an arbitrary number, cutting it too aggressively might negatively impact voting experience in terms of frequency of pieces (we already experienced this in the earlier weeks of Botto, even after a bugfix). We should also be mindful that this isn't a simple case of voters cycling through fragments until they've seen every piece.
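
To compare how sensitive the pool size is to the cull percentage, the calculation above can be parameterized. This sketch uses only the figures quoted in this post (a current pool of 6808 with the latest 350 left untouched) and does not model the three-week protection from the proposal; `pool_after` is a hypothetical helper.

```python
CURRENT_POOL = 6808  # pool size quoted above, including the round that just started
LATEST_ROUND = 350   # newest fragments, treated as untouchable


def pool_after(cull_percent: int) -> int:
    """Pool size after culling `cull_percent` of everything except the latest round."""
    older = CURRENT_POOL - LATEST_ROUND
    kept = older * (100 - cull_percent) / 100
    return round(kept + LATEST_ROUND)


for pct in (80, 90, 95):
    print(pct, pool_after(pct))
# 80 1642
# 90 996
# 95 673
```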

Combined with the 3 round time protection element, we'd also be retroactively culling many fragments that already have normalized true scores (because there's ample voting data on them) for the sake of protecting fragments that have lower, more volatile true scores. The current trend with more recent round fragments is that their (current) true scores will further normalize at lower values, as seen with older fragments from earlier rounds. A simple reason for this would be that some 'get tired' of seeing the old crowd favorites, so they vote on newer fragments. Between the lines, that means we'll be preserving newer fragments that are already 'deemed' ugly by the community.

Alternatively put, older fragments that we can dub 'crowd favorites' have normalized scores at lower values relative to newer fragments, meaning newer fragments with fresher true high scores are prioritized in being shown 80% of the time. This means older fragments have their 'backs against the wall', while newer fragments have initially higher true scores that are volatile and more susceptible to normalizing at lower values anyway (unless they are deemed incredible fragments and have a very high appreciation rate over an extended period of time).

Looking at 3 week/round time protection in isolation: This discredits the scoring system we've used for 20+ weeks where the clear noticeable issue is that we are taking too long to score the most recent fragments, not that the system doesn't work. The current system already has time protection baked into it: Newer fragments from each week are scored by default at 100, meaning their visibility is highly prioritized at 80% throughout the earlier stages of the week. This in itself is time protection: we want to give newer fragments a 'fair' chance at being reviewed by the community in and through their votes. Even then, when true scores of the latest fragments are eventually attributed, they are likely to be higher relative to older fragments (see previous argument). This means that without the time protection you've proposed, the system will cull an old fragment anyway - unless there is clearly something worse from the newer batch of fragments to take its place.

I apologize if this post comes across as convoluted, I'm happy to try to clarify what I've written (also wrote this pretty quickly so I hope it doesn't read as gibberish). My TL;DR here is we should not commit to other measures of time protection as the system does accommodate for it already.

TheMuffinMan

choobie

There's a lot going on in this response, I know we've already discussed a lot of it. Here's my problem with some of your current comments:

It now seems like you're concerned that we'll lose the best/crowd favorite legacy fragments if we cull based on true score. When this proposal was first created it was going to leverage the leaderboard, which had the crowd favorites, and you opposed that too. It really seems like you will oppose the culling of any fragments, even when your rationale is contradicting itself. At this point, I'm not sure there's anything I can say or do that you won't have an issue with, so the proposal is what it is.
You really need to discuss time protection internally with Quasimondo - he specifically highlighted that time protection was a requirement. Clearly, there are diverging opinions internally that should be addressed for the good of this project.

Lastly, if time protection turns out to not work as intended, it could be easily removed in a subsequent proposal. You don't currently have any data to support that time protection won't work, or at least you haven't shared any.