Facebook Graph: Page/Album Edge Recap
facebook-graph
python
automation
etl
wwe
]
In the previous post, the
Page/Albums Edge allowed us to
get a list of all Album nodes associated with a given
Page Node. From this, we
created a simple page_id/album_id mapping table (called page_album_map
). Now, if your only
working with a single Facebook Page, then this table might
seem kind of silly. But! The second you begin working on a second Facebook Page,
it starts to take on some meaning. If you work for something like a record label, publishing company, or sports empire,
then it’s likely you’re tracking tens to hundreds of Facebook Pages, easy. No question here: the mapping
table becomes pretty dang important!
We also found that it was possible to return each Album’s fields from the Page/Albums Edge, e.g.:
# Assume token object has been created that allows such behavior
fields = ','.join([
'id', 'name', 'type', 'cover_photo',
'description', 'event', 'link', 'place',
'privacy', 'created_time', 'updated_time'
])
token.get('me/albums?fields='+fields)
This is great! In other words, one needn’t return a list of album IDs in order to access each Album Node separately. How laborious!
# Laborious Way
page_albums = token.get('me/albums', paginate=True)
album_id = [album['id'] for album in page_albums['data']]
album_data = []
fields = ','.join([
'id', 'name', 'type', 'cover_photo',
'description', 'event', 'link', 'place',
'privacy', 'created_time', 'updated_time'
])
for aid in album_id:
album_data += token.get(aid+'?fields='+fields)
And so we created a table for album fields, imaginatively called album_fields
. Now, if you have tens
to hundreds of Facebook Pages, then you likely
have hundred to thousands of Facebook Albums. Good thing we have that page/album mapping table! This way,
we can easily map album field records to their parent pages:
That’s why we have the page_id/album_id mapping table:
SELECT *
FROM album_fields A
JOIN page_album_map B
ON A.album_id = B.album_id
WHERE B.page_id = '9899376497'
The question might arise: Why not just include the page_id in this table instead of requiring a separate mapping table.
Short Answer: The page_album_map and album_fields tables were kept separate because at one point in my life I heard about normalization in relational databases.
Slightly Longer Answer: In the page\_album\_map
table, each (page_id, album_id)
pair is written only once. However, in the album\_fields
table, an album_id may be written multiple
times (one row per album update). If we included the page_id field, we would waste space and allow for the
possibility of error. This might not seem like a big deal here, but below we will create album engagement
tables that get updated daily or hourly – at which point it becomes a slightly bigger deal.
Not gonna lie: we have tables in Redshift that rebel like crazy against normalization, and quite frankly, sometimes it’s just simpler that way. I’m not a database guru, so judge me not if I depart from such norms anywhere in here or the annals of this blog!
Next Up
In the next installment in this Facebook Graph series, I will cover the Album Node itself in more detail.