Then I learned R and have even more reason to avoid Excel.
Recently I needed to implement date based cohorts in SiteCatalyst. While there are a few blog posts on how to do this in Excel using Report Builder (http://adam.webanalyticsdemystified.com/2013/03/07/conducting-cohort-analysis-with-adobe-sitecatalyst/ , http://blogs.adobe.com/digitalmarketing/mobile/what-is-mobile-cohort-analysis-and-why-should-you-use-it/) they didn't work for me. My team is all on MacOS, and Report Builder isn't.
In this example I'm going to use events tracked by the Mobile Library lifecycle stats. One plus of this solution is it doesn't require any SAINT classifiers to convert mobileinstalldate to a month/year.
The idea here is you use QueueTrended to chunk together uniqueusers by month, with mobileinstalldate as the counted event. If you look at the data output from QueueTrended is makes more sense. The rest is then using plyr and reshape2 to beat the data into the form we want. It works just fine with segments.
I'm not sharing my code that generates percentages yet because I'm not particularly happy with it yet. Drop me a line if you are interested.
And yes, the data is small, this is from a private unreleased product I am working on.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
> print(cohort_chart) | |
startmonth 0 1 2 3 4 | |
1 Oct 2013 9 1 0 0 0 | |
2 Nov 2013 27 13 12 10 0 | |
3 Dec 2013 35 14 9 0 0 | |
4 Jan 2014 45 28 0 0 0 | |
5 Feb 2014 29 0 0 0 0 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# needed to access SiteCatalyst programatically | |
require(RSiteCatalyst) || install.packages("RSiteCatalyst") | |
library(RSiteCatalyst) | |
# has a handy yearmon class for dealing with months | |
require(zoo) || install.packages("zoo") | |
library(zoo) | |
library(reshape2) | |
library(plyr) | |
#Call SCAuth function to set credentials before usage | |
SCAuth("your creds", "your secret") | |
MonthCohortChart <- function(trended_data){ | |
# my apologies, I'm new at R, there is probably | |
# a more elegant way of doing this | |
# first off, create a column for start day by month | |
f = ddply(trended_data, | |
.(year, month, day), | |
transform, | |
startmonth = as.yearmon(mobileinstalldate, "%m/%d/%Y") ) | |
# then create one for the bucket month | |
j = ddply(f, | |
.(year, month, day), | |
transform, | |
bucketmonth = as.yearmon(ISOdate(year, month, day)) ) | |
# lastly, calculate the cohort month | |
k = ddply(j, | |
.(year, month, day), | |
transform, | |
cohortmonth = round((bucketmonth - startmonth) * 12 ) ) | |
# now remove all the negative cohort months as they are meaningless | |
#todo, clear the diagonals that don't matter instead of leaving them as 0 | |
k = subset(k, cohortmonth >= 0) | |
# reshape into mobileinstalldate and name (month) | |
# sum together the number of unique users in each bucket | |
molten_form <- melt( k, | |
id=c("startmonth", "cohortmonth"), | |
measure.vars=c("uniquevisitors")) | |
final_output <- dcast(molten_form, | |
fun.aggregate=sum, | |
startmonth ~ cohortmonth) | |
} | |
uniques_by_month_cohorted <- | |
QueueTrended("amhammersmith-extpre", | |
"2013-10-01", | |
"2014-02-28", | |
"month", | |
"uniquevisitors", | |
"mobileinstalldate", | |
top="100000", | |
startingWith= "1", | |
segment_id="") | |
cohort_chart <- MonthCohortChart(uniques_by_month_cohorted) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
> print(uniques_by_month_cohorted) | |
mobileinstalldate name year month day uniquevisitors uniquevisitors_forselectedelements segment | |
1 12/19/2013 October 2013 2013 10 1 0 12 | |
2 12/19/2013 November 2013 2013 11 1 0 41 | |
3 12/19/2013 December 2013 2013 12 1 7 54 | |
4 12/19/2013 January 2014 2014 1 1 2 74 | |
5 12/19/2013 February 2014 2014 2 1 2 86 | |
6 1/14/2014 October 2013 2013 10 1 0 12 | |
7 1/14/2014 November 2013 2013 11 1 0 41 | |
8 1/14/2014 December 2013 2013 12 1 0 54 | |
9 1/14/2014 January 2014 2014 1 1 6 74 | |
10 1/14/2014 February 2014 2014 2 1 4 86 |