Discussion:
about ECDF display in ggplot2
Bogdan Tanasa
2018-07-06 23:37:37 UTC
Dear all,

I would appreciate your advice, suggestions, or comments on the following:

1 -- starting from a vector that contains LENGTHS (numerically, the values
are from 1 to 10 000)

2 -- if I display the ECDF with the following R code and some "limits":

BREAKS = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500,
           1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000)

ggplot(x, aes(LENGTH)) +
  stat_ecdf(geom = "point") +
  scale_x_continuous(name = "LENGTH of DEL",
                     breaks = BREAKS,
                     limits = c(0, 500))

3 -- I get the following warning message: "Warning message: Removed
109 rows containing non-finite values (stat_ecdf)."

The question is: are these 109 values removed only from the VISUALIZATION
because I set the "limits", or are they removed from the statistical
CALCULATION?

4 -- in contrast, if I use the standard R function plot(ecdf(...)), there is
no warning message:

plot(ecdf(x$LENGTH), xlab = "DEL LENGTH",
     ylab = "Fraction of DEL", main = "DEL", xlim = c(0, 500),
     col = "dark red")

Thanks a lot !

-- bogdan
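
On the question in point 3: with the default settings, limits given to scale_x_continuous() are applied before the statistic is computed. Values outside c(0, 500) are turned into NA and dropped, so the 109 rows are removed from the ECDF calculation itself, not only from the display. plot(ecdf(...), xlim = ...) computes the ECDF on all values and only restricts the drawn range, which is why it gives no warning. A minimal base-R sketch of the difference, using simulated lengths (the vector `len` and its values are assumptions for illustration, not the original data):

```r
# Simulated stand-in for the LENGTH column (hypothetical data):
# 900 values inside the limits and 100 values above them
set.seed(1)
len <- c(sample(1:500, 900, replace = TRUE),
         sample(501:10000, 100, replace = TRUE))

# Mimic what scale limits of c(0, 500) do by default:
# out-of-range values are discarded before the ECDF is computed
kept <- len[len >= 0 & len <= 500]
n_dropped <- length(len) - length(kept)   # 100 values dropped

F_full <- ecdf(len)    # what plot(ecdf(...), xlim = c(0, 500)) displays
F_kept <- ecdf(kept)   # what stat_ecdf() sees after the limits are applied

F_full(500)   # 0.9: 90% of ALL values are <= 500
F_kept(500)   # 1:   100% of the KEPT values are <= 500
```

The two curves therefore differ by a rescaling over the visible range: the ggplot2 version reaches 1 at the upper limit, while the base-R version stops at the true fraction of values below it.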
--
--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: https://github.com/hadley/devtools/wiki/Reproducibility

To post: email ***@googlegroups.com
To unsubscribe: email ggplot2+***@googlegroups.com
More options: http://groups.google.com/group/ggplot2

---
You received this message because you are subscribed to the Google Groups "ggplot2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ggplot2+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Edson Lira
2018-07-07 02:20:10 UTC
Check your data: outliers, format, NAs.
Bogdan Tanasa
2018-07-09 00:49:27 UTC
Dear all,

Following up on my previous email regarding ECDF in ggplot2, and in order to
be more descriptive, I've attached the following files to this email (as I do
not have a web server running at the moment, I would welcome your suggestions
on how to post these files online):

1 -- the R script (R_script_display_ECDF.R) that reads the file "LENGTH"
and outputs the ECDF figure using either the standard R function or ggplot2.

2 -- the display of the ECDF using the standard R function
("display.R.ecdf.LENGTH.pdf")

3 -- the display of the ECDF using ggplot2 ("display.ggplot2.ecdf.LENGTH.pdf")

The ECDF over xlim(0,500) looks very different between plot(ecdf) and
ggplot2. Could you please advise why? What should I change in my ggplot2
code?

thanks a lot,

- bogdan

PS: the R code is also included below:

library("ggplot2")

# Read the tab-delimited file "LENGTH"
file <- read.delim("LENGTH", sep = "\t", header = TRUE, stringsAsFactors = FALSE)

# ECDF with the standard R function
pdf("display.R.ecdf.LENGTH.pdf", width = 10, height = 6, paper = "special")
plot(ecdf(file$LENGTH),
     xlab = "DEL SIZE",
     ylab = "fraction of DEL",
     main = "LENGTH of DEL",
     xlim = c(0, 500),
     col = "dark red", axes = FALSE)
ticks_y <- c(0, 0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4)
axis(2, at = ticks_y, labels = ticks_y, col.axis = "red")
ticks_x <- c(0, 100, 200, 400, 500, 600, 700, 800)
axis(1, at = ticks_x, labels = ticks_x, col.axis = "blue")
dev.off()

# ECDF with ggplot2
BREAKS = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500,
           1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000)
barfill <- "#4271AE"
barlines <- "#1F3552"
pdf("display.ggplot2.ecdf.LENGTH.pdf", width = 10, height = 6,
    paper = "special")
ggplot(file, aes(LENGTH)) +
  stat_ecdf(geom = "point", colour = barlines, fill = barfill) +
  scale_x_continuous(name = "LENGTH of DEL",
                     breaks = BREAKS,
                     limits = c(0, 500)) +
  scale_y_continuous(name = "FRACTION") +
  ggtitle("ECDF of LENGTH") +
  theme_bw() +
  theme(legend.position = "bottom",
        legend.direction = "horizontal",
        legend.box = "horizontal",
        legend.key.size = unit(1, "cm"),
        axis.title = element_text(size = 12),
        legend.text = element_text(size = 9),
        legend.title = element_text(face = "bold", size = 9))
dev.off()
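
A likely reason the two plots differ over 0 to 500: limits = c(0, 500) in scale_x_continuous() discards the values above 500 before stat_ecdf() runs, so the ggplot2 curve reaches 1 at 500, whereas plot(ecdf(...)) keeps all values and only restricts the drawn range. One way to get matching output is to leave the limits off the scale and zoom with coord_cartesian() instead. The sketch below uses simulated data (the data frame `dat` is a hypothetical stand-in for the "LENGTH" file, and the breaks are simplified):

```r
library(ggplot2)

# Hypothetical stand-in for the "LENGTH" file:
# 900 values up to 500 and 100 values above 500
set.seed(1)
dat <- data.frame(LENGTH = c(sample(1:500, 900, replace = TRUE),
                             sample(501:10000, 100, replace = TRUE)))

p <- ggplot(dat, aes(LENGTH)) +
  stat_ecdf(geom = "point") +
  # No `limits` here, so every value enters the ECDF calculation
  scale_x_continuous(name = "LENGTH of DEL",
                     breaks = seq(0, 500, by = 100)) +
  scale_y_continuous(name = "FRACTION") +
  # Zoom the view only; the underlying data are left untouched
  coord_cartesian(xlim = c(0, 500)) +
  ggtitle("ECDF of LENGTH") +
  theme_bw()
```

Calling print(p) draws the plot. Because no rows are discarded, this version should not trigger the "Removed ... rows" warning, and the curve at 500 shows the fraction of all values that are at most 500 (0.9 for this simulated data).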
Hadley Wickham
2018-07-09 02:23:51 UTC
I would highly recommend that you create a reprex, see
https://www.tidyverse.org/help/ for some details.
Hadley
--
http://hadley.nz
Bogdan Tanasa
2018-07-09 05:39:14 UTC
Dear Hadley,

Thank you for your message; great to hear from you. I can see that the input
and output files are available at:
https://groups.google.com/forum/#!topic/ggplot2/qZzi43CO7nY

Nevertheless, I have tried to build a reprex repository, and the folder is
attached (I hope it is helpful, though I am not sure that all the files inside
it meet the best standards). Thanks a lot for your help,

-- bogdan