I am trying to build an (undirected) social network based on the co-occurrence of individuals. A clustering algorithm will later be applied to this network to find distinct subgroups. The problem is that the studied species has a very short lifespan (or rather, very high mortality due to predation). As a result, the relationships in my network may not all have existed at the same time. In the diagram below, the "red" individuals have almost died out after 3-4 years*, yet they have had the "longest" time to "meet" other individuals, while the "blue" individuals have had only two years to "meet" others.
* In theory, I can assume that each individual has an expected lifespan of less than 10 years. So not catching "red" individuals 5 or 6 years after marking does not necessarily mean that they are dead.
How can this time effect be incorporated into a social network?
Specific questions I want to answer: First question: Do the observed social associations differ from associations explained solely by shared use of space? I.e., how can I test whether associations are random or preferential?
If the answer to the first question is that associations between individuals are NOT random, then I have a second question...
Is social structure correlated with genetic relatedness? I.e., are closely related individuals found together more often? (DNA profiles are available for all individuals.)
Here I have created some data that are structurally similar to my database:
data <- data.frame(obs_date = c("C1","C2","C3","C4","C5","C6","C1","C2",
"C3","C4","C1","C2","C3","C1","C2","C3",
"C4","C5","C6","C7","C1","C3","C4","C5",
"C6","C7","C8","C3","C4","C5","C6","C7",
"C3","C4","C5","C6","C3","C4","C5","C3",
"C4","C5","C6","C5","C6","C7","C8","C5",
"C5","C6","C7","C8","C5","C6","C7","C7",
"C7","C8","C7","C8","C7","C8","C7","C8"),
ind_id = rep(LETTERS[1:20], times = c(6,4,3,7,1,6,5,4,
3,2,2,4,1,4,3,1,2,2,2,2)),
obs = rep(c("seen","not_seen","seen","not_seen","seen",
"not_seen","seen","not_seen","seen"),
times = c(3,1,4,1,9,1,9,3,33)))
Here I have added genetic structure. The data are entirely fabricated, but they should reflect close genetic relatedness among individuals of the same color. In addition, "purple" individuals are descendants of "blue", "blue" are descendants of "green", and "green" are descendants of "red".
gen.raw <- matrix(c("a","g","g","g","c","g","a","a","g","g","g","g","t","c","t","c","t","t","a","a","t","t","a","a",
"a","g","g","g","c","g","a","a","g","g","g","g","c","c","t","c","t","t","a","a","t","c","a","a",
"a","g","g","g","c","g","g","a","g","g","g","g","c","c","t","t","c","t","a","a","t","c","a","a",
"a","g","t","t","t","g","g","a","g","g","g","g","c","c","t","t","c","t","a","a","a","c","a","a",
"a","g","t","t","t","g","g","a","g","g","g","g","c","c","t","t","c","t","t","a","a","c","a","a",
"a","g","t","t","t","g","g","a","g","g","g","g","c","c","t","t","c","t","t","a","a","c","a","a",
"a","g","t","t","t","g","g","g","g","g","c","g","c","c","t","t","c","t","t","a","a","c","a","a",
"a","g","t","t","t","g","a","c","g","t","c","g","c","c","t","t","c","t","t","a","a","c","a","a",
"a","g","t","t","t","g","a","c","g","t","c","g","c","c","t","t","c","t","t","a","a","c","a","a",
"a","g","t","t","t","g","a","c","g","t","c","g","c","c","t","t","c","t","t","a","a","c","a","a",
"a","g","t","t","t","g","a","c","g","t","c","g","c","c","t","t","c","t","t","a","a","c","a","a",
"a","g","t","t","t","g","a","c","g","t","c","g","c","c","t","t","c","t","t","a","a","c","a","a",
"a","g","t","t","t","g","a","c","g","t","c","g","c","c","t","t","c","t","t","a","a","c","a","a",
"a","g","t","t","t","g","a","c","g","t","c","g","c","c","t","t","c","t","t","a","t","c","a","a",
"a","g","t","t","t","g","a","c","g","t","c","g","c","c","t","t","c","t","t","a","t","c","a","a",
"a","g","t","t","t","g","a","c","g","t","c","g","c","c","t","t","c","t","t","a","t","c","a","a",
"a","g","t","c","t","g","a","c","g","g","c","g","c","c","t","t","c","t","t","a","t","c","a","a",
"a","g","t","c","t","g","a","c","g","g","c","g","c","c","t","t","c","t","t","a","t","c","a","a",
"a","g","t","c","t","g","a","c","g","g","c","g","c","c","t","t","c","t","t","a","t","c","a","a",
"a","g","t","c","t","g","a","c","g","c","c","g","t","c","t","t","c","t","t","a","t","c","a","a"),
byrow = TRUE, ncol = 24)
rownames(gen.raw) <- LETTERS[1:20]
OK, the source data are given above. Now I will create two distance matrices. The first is an association matrix derived from the co-occurrence data, represented by the OR-SP index. The observed roost-sharing proportion is calculated for each pair of individuals by dividing the number of days on which the two individuals were found together by the number of all possible days on which they could have been together (the overlap between the first and last records of both individuals).
# matrix of days roosting together
EG <- expand.grid(unique(data$ind_id), unique(data$ind_id))
data_seen <- subset(data, obs == "seen")
my.length.dt <- numeric(nrow(EG))
for (i in seq_len(nrow(EG))) {
  # number of dates on which both individuals of the pair were seen
  my.length.dt[i] <- length(intersect(as.vector(data_seen$obs_date[data_seen$ind_id == EG[i, 1]]),
                                      as.vector(data_seen$obs_date[data_seen$ind_id == EG[i, 2]])))
}
# build the matrix once, after the loop has filled the vector
days.together <- matrix(my.length.dt, byrow = TRUE, ncol = length(unique(data$ind_id)))
colnames(days.together) <- rownames(days.together) <- unique(data$ind_id)
days.together
# matrix of all possible potential roosting days
EG <- expand.grid(unique(data$ind_id), unique(data$ind_id))
my.length.rdp <- numeric(nrow(EG))
for (i in seq_len(nrow(EG))) {
  # number of dates on which both individuals have a record (seen or not_seen)
  my.length.rdp[i] <- length(intersect(as.vector(data$obs_date[data$ind_id == EG[i, 1]]),
                                       as.vector(data$obs_date[data$ind_id == EG[i, 2]])))
}
# build the matrix once, after the loop has filled the vector
roosting_days_possible <- matrix(my.length.rdp, byrow = TRUE, ncol = length(unique(data$ind_id)))
colnames(roosting_days_possible) <- rownames(roosting_days_possible) <- unique(data$ind_id)
roosting_days_possible
# observed roost-sharing proportion
OSP <- days.together / roosting_days_possible
OSP[is.nan(OSP)] <- 0   # pairs with no possible shared days give 0/0
diag(OSP) <- 0
# association matrix derived from the co-occurrence data
round(OSP, 2)
# social distance matrix (note: OSP is an association index, so larger values mean closer association)
soc_dist <- as.dist(OSP)
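The two pair-count matrices can also be obtained without explicit loops. The following is only a minimal sketch on a small made-up data frame (`toy`, `incidence()`, `ids`, and `dates` are hypothetical names, not part of my real data); it assumes the same three columns as `data` above and counts, for each pair, the dates on which both individuals were seen and the dates on which both have any record.

```r
# toy capture records (hypothetical), same columns as `data` in the post
toy <- data.frame(
  obs_date = c("C1", "C2", "C3", "C1", "C2", "C2", "C3"),
  ind_id   = c("A",  "A",  "A",  "B",  "B",  "C",  "C"),
  obs      = c("seen", "seen", "not_seen", "seen", "seen", "seen", "seen")
)

ids   <- sort(unique(toy$ind_id))
dates <- sort(unique(toy$obs_date))

# individual x date incidence matrix: 1 if the individual has a record on that date
incidence <- function(d) {
  tab <- table(factor(d$ind_id, levels = ids), factor(d$obs_date, levels = dates))
  (unclass(tab) > 0) * 1
}

# pairwise counts of shared dates via the cross-product of incidence matrices
days_together <- tcrossprod(incidence(toy[toy$obs == "seen", ]))
days_possible <- tcrossprod(incidence(toy))

# roost-sharing proportion, with the same clean-up steps as in the loop version
osp <- days_together / days_possible
osp[is.nan(osp)] <- 0
diag(osp) <- 0
round(osp, 2)
```

Fixing the factor levels to `ids` and `dates` keeps both incidence matrices the same shape even when an individual was never recorded as "seen".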
The next step is to take the DNA sequences and build a matrix of genetic relatedness:
# creating matrix of relatedness
library(ape)
gen.str <- as.DNAbin(gen.raw)
my.gen.dist <- dist.dna(gen.str)
# "ward" was renamed to "ward.D" in R >= 3.1.0
fit <- hclust(my.gen.dist, method = "ward.D")
plot(fit) # display dendrogram
Finally, I compare social distance with genetic distance using a Mantel test.
library(ade4)
mantel.rtest(soc_dist, my.gen.dist, nrepet = 9999)
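To make clear what I understand the Mantel test to be doing, here is a rough base-R sketch of a permutation version. `mantel_perm()` is a hypothetical helper written for illustration, not the `ade4` implementation; it assumes both inputs are `dist` objects over the same individuals in the same order, and it uses a one-sided test (are larger distances in one matrix paired with larger distances in the other?).

```r
# Permutation Mantel test (illustrative sketch, one-sided "greater")
mantel_perm <- function(d1, d2, nperm = 999) {
  m1 <- as.matrix(d1)
  m2 <- as.matrix(d2)
  lt <- lower.tri(m1)                 # use each pair once
  r_obs <- cor(m1[lt], m2[lt])        # observed matrix correlation
  r_perm <- replicate(nperm, {
    p <- sample(nrow(m1))             # relabel individuals in one matrix
    cor(m1[p, p][lt], m2[lt])
  })
  # share of permutations at least as extreme as the observed value
  p_value <- (sum(r_perm >= r_obs) + 1) / (nperm + 1)
  list(r = r_obs, p = p_value)
}

# made-up example: two distance matrices built from nearly identical coordinates
set.seed(1)
xy  <- matrix(rnorm(20), ncol = 2)
d_a <- dist(xy)
d_b <- dist(xy + rnorm(20, sd = 0.1))
res <- mantel_perm(d_a, d_b)
res
```

Because the test is one-sided, the direction of both matrices matters. Since OSP is an association index rather than a distance, one option (an assumption on my part) would be to call it as `mantel_perm(as.dist(1 - OSP), my.gen.dist)`, so that both matrices are dissimilarities.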
Does the result (p > 0.05) mean that there is no relationship between social and genetic structure?
Is this approach suitable for answering my question? Any ideas?
I also found that, for social structure, this kind of graph might be better than a dendrogram. It is good for finding distinct social groups.
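# Show social structure
library(igraph)
# graph.adjacency() and walktrap.community() are deprecated names in current igraph
g <- graph_from_adjacency_matrix(OSP, weighted = TRUE, mode = "undirected")
g <- simplify(g)
# set labels and degrees of vertices
V(g)$label <- V(g)$name
V(g)$degree <- degree(g)
# detect communities with the walktrap algorithm
wc <- cluster_walktrap(g)
plot(wc, g)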
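Besides the plot, the detected groups can be read off directly from the community object. This is a small sketch on a made-up association matrix (`assoc` is hypothetical: two tight triads with weak links between them), assuming a reasonably recent `igraph` version.

```r
library(igraph)

# toy association matrix (hypothetical): strong ties within A-C and D-F, weak ties between
assoc <- matrix(0.05, 6, 6, dimnames = list(LETTERS[1:6], LETTERS[1:6]))
assoc[1:3, 1:3] <- 0.9
assoc[4:6, 4:6] <- 0.9
diag(assoc) <- 0

g_toy  <- graph_from_adjacency_matrix(assoc, mode = "undirected", weighted = TRUE)
wc_toy <- cluster_walktrap(g_toy)        # edge weights are used by default

grp <- as.integer(membership(wc_toy))    # group index per individual
split(V(g_toy)$name, grp)                # individuals listed by detected group
modularity(wc_toy)                       # how well separated the groups are
```

The group membership vector could then be compared with the genetic clusters, for example by cross-tabulating it against groups cut from the dendrogram.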