Results! - The G5-G6 Challenge - An actual comparison

I never said my analysis was statistically significant. My analysis is the analysis I would perform for myself to make my healthcare decision. I would never tell members of this forum to do a t-test before making decisions on their health. I prefer not to, and will not, discuss statistically significant conclusions any further.

I wasn’t replying to you, I was replying to Jbowler’s claims about statistical significance.

Indeed, I fully agree with this. I am not claiming that Eric should do any statistical tests. I am simply disputing someone else’s claims.

I think you need to perform something like a t-test to find out.

We’re talking in a forum that generally doesn’t want to evaluate based on statistical analysis. Isn’t this your comment?

I think @Boerenkool was just responding to the quote below.

I was sort of responding to both boerenkool and jbowler when I said I didn’t think a t-test or similar statistical test was necessary in this situation.

I’m not sure if that is completely true. People often post studies on this forum which usually include some form of statistical analysis, and I think people value a statistical analysis over an anecdotal statement.

It just doesn’t really seem necessary here to dive that deep because the analysis is purely for Eric.

Eric certainly could take a stab at it if he felt it was necessary. Or seek out advice on the best method to prove statistical significance. Seems like a lot of work though when he seems pretty comfortable with his conclusion/assessment on the results and the analysis is specific to him- a bit anecdotal to begin with.

I don’t think so. I’m not considering @RCA221’s analysis; that certainly raises questions for people using a Dexcom CGM and MDI. I’m primarily interested in this statement:

It might help to put the results in engineering terms, comparing the G5 and G6. As @RCA221 pointed out, using the average of the readings means that negatives and positives cancel out. This is why I quote the standard deviation; this is a measure of the swings:

G5: -2% (2% lower than the meter) +/- 17%
G6: +14% +/- 17%

So, yes, over the test period the G6 was reading, on average, 14% over the meter and the G5 2% below (in fact the G6 averaged out to reading 19% higher than the G5). However, notice that the spread of the errors (the standard deviation) was the same in both cases.
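
For anyone who wants to reproduce figures of this "mean +/- standard deviation" kind, here is a minimal Python sketch. The readings are made-up placeholders, not Eric's data, the function names are hypothetical, and it assumes the sample standard deviation (dividing by n-1); the spreadsheet may use a different convention.

```python
from statistics import mean, stdev

def percent_errors(sensor, meter):
    """Signed error of each sensor reading relative to the meter, in percent."""
    return [100.0 * (s - m) / m for s, m in zip(sensor, meter)]

def bias_and_spread(sensor, meter):
    """Mean percent error (bias) and sample standard deviation (spread)."""
    errs = percent_errors(sensor, meter)
    return mean(errs), stdev(errs)

# Made-up placeholder readings in mg/dl (NOT Eric's data).
meter = [100, 150, 200, 80]
sensor = [110, 135, 220, 88]

bias, spread = bias_and_spread(sensor, meter)
print(f"{bias:+.1f}% +/- {spread:.1f}%")
```

The mean captures how far the sensor reads over or under the meter on average, while the standard deviation captures how much individual readings swing around that average.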

I don’t think anyone would challenge the idea that the G5, the G6 and the Contour Next One meter readings are all functions of the actual blood glucose, and, anyway, it is easy to show they are beyond any reasonable doubt. It’s also easy to show that the errors with either G5 or G6 don’t vary with the meter reading of blood glucose; here is a visual representation obtained by sorting Eric’s results by the meter reading:

[Image: G5 and G6 errors in mg/dl, sorted by meter reading]

Ok, so those are absolute values, not percentages. Maybe the size of the errors does increase with BG, but the errors are so big it is really difficult to tell visually. The figures for the absolute errors are:

G5: -3.8mg/dl +/- 16.0mg/dl
G6: +11.2mg/dl +/- 15.4mg/dl

What is going on here, and the root of the point I am trying to make, is that those errors are so massive that if Eric were to repeat the identical test for another 20 days, another 126 readings, the whole picture could be swapped for the G5 vs the G6; i.e. the G5 could be reading over and the G6 under.

We can also look at how the G5 and G6 correlate with the meter across the range of blood sugars measured by doing precisely what Dexcom apparently does when calibrating an individual sensor (but doing it for the whole set of measurements in each sensor class). This shows that the best approximation over the whole range is:

G5 = 10mg/dl + meter * 0.86
G6 = 18mg/dl + meter * 0.93

I’ve observed something like this myself with the G6 (this is anecdotal); the Contour Next One reports a lower BG at high BG than the sensor seems to. The figures suggest that the G6 is better than the G5 in this regard, at the cost of reading 8mg/dl higher all the time but this is totally unprovable from this dataset.
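
A straight-line fit of sensor against meter, like the two above, can be sketched as follows. This assumes an ordinary least-squares fit, which may or may not be what the spreadsheet (or Dexcom) actually uses; `fit_line` is a hypothetical helper and the readings are made up so that the answer is obvious.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up placeholder readings in mg/dl (NOT Eric's data), built to lie
# exactly on the line sensor = 10 + 0.5 * meter.
meter = [80, 120, 160, 200]
sensor = [10 + 0.5 * m for m in meter]

intercept, slope = fit_line(meter, sensor)
print(f"sensor = {intercept:.1f}mg/dl + meter * {slope:.2f}")
```

A slope below 1 means the sensor reads progressively lower than the meter as BG rises; the intercept is the constant offset.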

So what is going on? The errors look random, but that is because I sorted the data on the meter BG. Eric did a meter test a little over six times a day; when I look at the percentage errors they don’t seem to change much between readings (so I’m eyeballing the data - just trying to guess what is going on). My hypothesis (the next step after the eyeball) is that both the G5 and the G6 respond to recalibrations, either manual or internal, and it is this that produces the changes in the error. I.e. the errors are not random; they are the result of the changes introduced by the recalibration. I produced this graph of % error over time (well, reading) and I added trendlines based on the moving average over 6 readings (1 day); I can do it again based on 3 (Eric recalibrates the G5 every 12 hours) if anyone wants (it’s a couple of mouse clicks), but I’ll explain why I chose 6 below.

Ok, so that is a bit busy, but notice how much busier the G5 line is. This is why I did the trendline over 6 readings; two of Eric’s calibrations and an unknown number of the G6’s automatic recalibrations. I wanted to try to smooth out the recalibration period, which, based on sampling theory, meant I had to average over more than the time between recalibrations; 6 samples was apparently fairly safe.
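
The moving-average trendline can be approximated outside a spreadsheet too. The sketch below assumes a trailing window (each point is the average of the current reading and the ones before it); `moving_average` is a hypothetical helper and the input is a made-up series, not the % errors from Eric's data.

```python
def moving_average(values, window=6):
    """Trailing moving average; None until a full window is available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the reading that just fell out of the window.
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out

# Made-up placeholder series (NOT Eric's data).
errors = [1, 2, 3, 4, 5, 6, 7]
trend_6 = moving_average(errors, window=6)   # roughly one day of readings
trend_3 = moving_average([3, 6, 9, 12], window=3)
print(trend_6)
print(trend_3)
```

A wider window smooths out more of the reading-to-reading jumps, at the cost of reacting more slowly to a real change.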

So that graph is really weird. The G5 is apparently switching between ‘over’ and ‘under’ quite regularly; we only have the data corresponding to the time of a meter BG so what we are looking at is an aliased view which exaggerates the effect (think of '50s television presenters with bad tie choices, or fishnet stockings if you are more Rocky Horror Picture Show). The G6 is much smoother though it still shows very significant glitches compared to the trendlines.

When I look at my own blood glucose over a 24 hour period (as reported by a G6) I do see swings but they are not rapid enough to cause aliasing sufficient to explain the wackiness of that graph. The swings in my BG would have to be happening more frequently than once every four hours for that.

Anyway, this requires more consideration on my part. A copy of my current spreadsheet (xlsx format; i.e. XML and there is no VB in there) is here:

https://drive.google.com/open?id=19GsOUaLhDpeed33nxbUwkLu-xWpvzjBz

You probably need to download that and load it into Excel, though Google Sheets seems to be sort-of functional.

Thanks very much for providing your data. The more real data we have to analyze the better we can identify issues for discussion. I will take a look at your data tomorrow. A bit late for me tonight. Thanks again.

Exactly, and I quoted the part of jbowler’s comment I was responding to, so I’m not sure where the misunderstanding comes from.
The conversation basically went like this: Eric presented his analysis. Jbowler replied that the difference wasn’t significant. Then I doubted whether jbowler’s reasoning was correct and we had a short conversation about that.
Again, I don’t care whether the difference is significant and I don’t think Eric has to care either, but someone else makes claims about the significance without doing appropriate statistical tests and I dispute those claims.

I think you do have to do more than looking at the means and standard deviations to draw conclusions about the statistical significance of the difference. Even if the difference between the means is smaller than the standard deviations, the difference may be statistically significant. Take a look at this Nature Methods paper, figure 1B in particular: Error bars | Nature Methods
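
As an illustration of what "more than looking at the means and standard deviations" could mean here: because each G5 and G6 reading is taken against the same meter test, one option is a paired t-statistic on the per-reading differences. The sketch below is only that, a sketch: the readings are made up (not Eric's data), `paired_t` is a hypothetical helper, and turning the statistic into a p-value would still need a t-table or a stats package.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t-statistic and degrees of freedom for the differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    std_err = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / std_err, n - 1

# Made-up placeholder readings in mg/dl (NOT Eric's data).
g5 = [100, 140, 95, 180]
g6 = [112, 150, 111, 190]

t_stat, dof = paired_t(g6, g5)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The point is that the t-statistic divides by the standard error of the mean difference, which shrinks with the number of readings, not by the standard deviation of individual readings.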

Ah, thanks. That helps a lot. In particular it documents the effect of the sample size (126) on the accuracy of the estimate for the mean; in this case the mean error calculated for the G6 vs G5 sensors. That factor, 1/sqrt(126), means that the error in the mean (as opposed to a single sample) is more like +/-1.3mg/dl. So I think that does indicate a near certainty that the G6 sensors are reading higher than the G5 ones, despite the large errors in individual readings shown in both sets of sensors.
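
The arithmetic behind that 1/sqrt(126) factor can be checked in a few lines of Python. The standard deviations and mean errors are the absolute (mg/dl) figures quoted earlier in the thread; `standard_error` is just a hypothetical helper name.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for n readings with standard deviation sd."""
    return sd / math.sqrt(n)

N_READINGS = 126

# Mean error and standard deviation in mg/dl, as quoted earlier in the thread.
quoted = {"G5": (-3.8, 16.0), "G6": (11.2, 15.4)}

for name, (mean_err, sd) in quoted.items():
    se = standard_error(sd, N_READINGS)
    print(f"{name}: mean error {mean_err:+.1f}mg/dl, standard error {se:.2f}mg/dl")

gap = quoted["G6"][0] - quoted["G5"][0]
print(f"gap between mean errors: {gap:.1f}mg/dl")
```

This gives a standard error of roughly 1.4mg/dl for each sensor type, in the same ballpark as the estimate above, against a gap of about 15mg/dl between the two mean errors.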

Of course there are only 5 sensors in the sample set; while the number of samples (126) is large, the number of sampled sensors of each type is very small. In all likelihood the sensors came from the same lot. All the same, the trendlines on my second graph don’t seem to have any discontinuities around the time the sensors must have been changed.

I was mistaken in assuming you were providing your data comparing G5 and G6 readings when you were providing Eric’s data in a “Sheets” format. I managed to access Eric’s data based on his instructions to convert his data on this thread to a .CSV extension and thus I have already provided my analysis. But thanks for sending me Eric’s data in another useful format.

Eric, I will be getting my first G6 today and was planning on doing a G5/G6 test for 10 days. Just realized I will have to use two separate sensors. Sometimes sensors are fairly inaccurate and I can’t wait to change them. In essence, no two sensors are likely to give you the same readings. So now I think doing this comparison is like comparing apples to oranges.

I ran the G5 through 10 days, so that for both the G5 and G6 sensors, the first day and last day would be the same. I thought that was the most fair way to do it.

Otherwise, day 1 on the 2nd G5 would be day 8 on the G6. And day 4 of the 2nd G5, would be day 1 on the 2nd G6 (if you are following all of that!!)…

So it seemed to me that the best way of doing it would be to run them both for 10 days.

Hope my explanation makes sense.

I wasn’t questioning your study at all or syncing up the days. I am about to do a similar test and am concerned that since there are two separate sensors, one could be more accurate than the other. For example, any two sensors, including any two G5 sensors or two G6 sensors, could have different results. So would I really be comparing the G5 vs G6 when the sensors themselves, not whether they are G5 or G6, could potentially throw off the accuracy? Anyway that is my concern. That didn’t concern you?

Well, my thought was this - It’s the devil you know…

I am familiar with the G5. Maybe the G6 would be a big improvement. Impress me, G6! Make me want to switch. Give me a reason!

It had 2 chances. It did not convince me.

Maybe on 4 tests or 6 tests or 10 tests I might be convinced.

But since I got 2 free ones, and it did not win me over, that was enough for me to stay with what I have.

This wasn’t really a science experiment for the masses. More like a test drive for me, of a new car versus sticking with the clunky old car I have. I am going to stay with the old car.

I wasn’t trying to convince you to switch. Actually I think we are on the same page.

Well, yes and no; you will have a set of results and they refer to a single G5 transmitter+receiver with “one and a half” G5 sensors (7 days from one, 3 from the next) and a single G6 transmitter+receiver with a single G6 sensor. You are correct that, from just that data, you won’t be able to distinguish differences caused by sensor variation from differences caused by variation in the whole system.

But that isn’t true of what @Eric did. He had multiple sensors. I couldn’t see any discontinuity at the point (I guess around reading 63) where he changed sensors; here’s the graph by (ordinal) reading (again):

The trendlines cross over, briefly, around the sensor change, but they had met up before, around reading 22 (which I guess is about 3 days in), and they meet again at the end of the second sensors’ 10 days, around reading 120.

The “G6 reads higher” theme seems to have been shown across both pairs of sensors.

Anyway, you will have a third set of data, and this time it’s pretty likely you will be using sensors from different lots (Eric’s were probably from the same lots). You will probably also be using a different rev of the G6 transmitter, and this is where the algorithms that matter are.

Hum, I guess I should really split the stats across the two pairs of sensors… @Eric; do you know which reading is the last from the first pair?

Yes, I understand. I just wanted to clarify my thoughts on it for the benefit of anyone thinking about switching and looking at this thread.

Ultimately it’s best to try it out. But not everyone may have that opportunity. I think going from the G5 to the G6 is easy from an insurance perspective. But going backward might be harder.

I probably know the answer but need to make sure I understand the question. :grinning:

You are asking when I went from sensor 1 to sensor 2?

Yes, where “when” means the relevant entry in the list you posted here: