NLP – DXC leadership conference CTO Keynote

Initial setup

Create a node for each transcript:

CREATE (n1:transcript { name: 'MLPAnalytics', title: 'Thriving on Enterprise Data and Analytics'})
CREATE (n2:transcript { name: 'MLPApps', title: 'Agile Applications and Digital Experiences' })
CREATE (n3:transcript { name: 'MLPBanking', title: 'Financial Inclusion: Key to the Future of Banking' })
CREATE (n4:transcript { name: 'MLPCloud', title: 'Hybrid Cloud Platforms Enable the Digital Enterprise' })
CREATE (n5:transcript { name: 'MLPHalthcare', title: 'The Shift to Digital Health and the Era of Healthcare 3.0' })
CREATE (n6:transcript { name: 'MLPInsurance', title: 'Key Shifts Mark the Path to Digital Insurance' })
CREATE (n7:transcript { name: 'MLPRisk', title: 'A Hyperconnected World Triggers New Forms of Enterprise Risk' })
CREATE (n8:transcript { name: 'MLPWorkplace', title: 'Empowering Workforces with Invisible IT' })
CREATE (n9:transcript { name: 'MLPOverview', title: 'Transforming to a Digital Enterprise' })
CREATE (n11:transcript { name: 'DanLeaders17', title: 'DXC TechTalk April 2017' })
CREATE (n12:transcript { name: 'DXCTechTalkApril2017', title: 'DXC TechTalk April 2017' })
CREATE (n13:transcript { name: 'DXCTechTalkMay2017', title: 'DXC TechTalk May 2017' });

 

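Set a uniqueness constraint on the transcript and word nodes; this also creates an index on each key field. Labels are case-sensitive, so the constraint uses Word, the label the import script writes.

CREATE CONSTRAINT ON (w:Word) ASSERT w.name IS UNIQUE;
CREATE CONSTRAINT ON (t:transcript) ASSERT t.name IS UNIQUE;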

 

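Import script

Run the following once per transcript, replacing the placeholder text, the stop-word list and the transcript name.

// Lower-case the text, split on spaces and drop stop words.
// The stop words must be lower case too, because the text has been lower-cased.
WITH split(toLower("TEXT GOES HERE"), " ") AS words
WITH [w IN words WHERE NOT w IN ["stop", "words", "go", "here"]] AS text
// Walk the text one adjacent pair of words at a time
UNWIND range(0, size(text) - 2) AS i
MERGE (w1:Word {name: text[i]})
 ON CREATE SET w1.count = 1 ON MATCH SET w1.count = w1.count + 1
MERGE (w2:Word {name: text[i+1]})
 ON CREATE SET w2.count = 1 ON MATCH SET w2.count = w2.count + 1
// NEXT records how often w2 directly follows w1
MERGE (w1)-[r:NEXT]->(w2)
 ON CREATE SET r.count = 1
 ON MATCH SET r.count = r.count + 1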
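// Link both words to the transcript node (assumes the node exists)
// and keep a per-transcript count on the INCLUDED relationship
WITH w1, w2
MATCH (t:transcript)
WHERE t.name = "DanLeaders"
MERGE (t)-[r1:INCLUDED]->(w1)
 ON CREATE SET r1.count = 1 ON MATCH SET r1.count = r1.count + 1
MERGE (t)-[r2:INCLUDED]->(w2)
 ON CREATE SET r2.count = 1 ON MATCH SET r2.count = r2.count + 1;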
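
To make the shape of the resulting graph concrete, here is a minimal plain-Python sketch of the same bookkeeping, outside Neo4j. The function name `build_word_graph` and the sample sentence are made up for illustration; the loop is only meant to mirror the MERGE steps above, not to replace them.

```python
from collections import Counter

def build_word_graph(text, stop_words):
    """Mirror the import script: per-word counts and NEXT-pair counts."""
    # Lower-case, split on single spaces, drop stop words (as the Cypher does)
    words = [w for w in text.lower().split(" ") if w not in stop_words]
    word_count = Counter()  # plays the role of Word.count
    next_count = Counter()  # plays the role of NEXT.count
    for i in range(len(words) - 1):  # same indices as range(0, size(text)-2)
        w1, w2 = words[i], words[i + 1]
        word_count[w1] += 1
        word_count[w2] += 1
        next_count[(w1, w2)] += 1
    return word_count, next_count

# Hypothetical sample text; after stop-word removal: leading edge leading edge
words, pairs = build_word_graph(
    "the leading edge of the leading edge", {"the", "of"})
print(words["leading"])            # 3
print(pairs[("leading", "edge")])  # 2
```

Note that "leading" appears twice in the sample but is counted three times: every word except the first and last is touched once as the first word of a pair and once as the second. The word counts are therefore roughly double the true frequency, which may be why the counts reported below are all even. The ranking of words is largely unaffected.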

Output and queries

Top words

MATCH (t:transcript {name: "DanLeaders"})--(w:Word)
RETURN t.name, w.name, w.count
ORDER BY w.count DESC;

Correction to the above query: w.count returns the overall count of the word, i.e. the count across all imported transcripts. The import query counts both the overall occurrence of the word and its occurrence within the transcript, so the correct query is:

MATCH (t:transcript {name: 'DanLeaders'})-[r:INCLUDED]-(w:Word)
RETURN t.name AS Transcript, w.name AS Word, r.count AS Count
ORDER BY r.count DESC;

{"count":32,"name":"digital"}
{"count":44,"name":"clients"}
{"count":22,"name":"dxc"}
{"count":22,"name":"business"}
{"count":22,"name":"new"}
{"count":22,"name":"information"}
{"count":18,"name":"technology"}
{"count":18,"name":"key"}
{"count":18,"name":"lef"}
{"count":16,"name":"scale"}
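
The difference between the two counters in the correction above can be sketched in a few lines of Python. This is a simplified illustration, not the import script itself: the helper `import_words` and the word lists are invented, and each word is counted once rather than once per pair.

```python
from collections import Counter, defaultdict

overall = Counter()                    # like Word.count: shared by all transcripts
per_transcript = defaultdict(Counter)  # like INCLUDED.count: one tally per transcript

def import_words(transcript, words):
    """Hypothetical helper: update both tallies for one transcript."""
    for w in words:
        overall[w] += 1
        per_transcript[transcript][w] += 1

# Invented word lists, only to show the two counters diverging
import_words("DanLeaders", ["digital", "clients", "digital"])
import_words("MLPCloud", ["digital", "cloud"])

print(overall["digital"])                       # 3
print(per_transcript["DanLeaders"]["digital"])  # 2
```

As soon as a second transcript is imported, the overall tally no longer answers the question "how often does this word occur in this transcript?", which is why the corrected query reads the count from the relationship instead of the word node.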

Presented as a graph

What is interesting here is the flow of the words within the cloud.

MATCH (t:transcript {name: "DanLeaders"})--(w:Word)
RETURN t, w
ORDER BY w.count DESC LIMIT 25;

[Figure: 1.png, graph view of the query result]

Key phrases

Grouping words that commonly appear together calls out the key phrases within the text.

The path length range can be adapted; both 1..3 and 1..4 show good results.

MATCH p=(:Word)-[r:NEXT*1..3]->(:Word) WITH p
WITH reduce(s=0,x IN relationships(p) | s + x.count) AS total, p
WITH nodes(p) AS text, 1.0*total/size(nodes(p)) as weight
RETURN extract(x IN text | x.name) AS phrase, weight ORDER BY weight DESC LIMIT 10

 

phrase                                        weight
["leading","edge"]                            1.5
["bring","information","technology"]          1.3333
["information","technology","community"]      1.3333
["market","leading","edge"]                   1.3333
["standard","offers","designed"]              1.3333
["leaders","leading","edge"]                  1.3333
["leading","edge","portfolio"]                1.3333
["key","issues","minds"]                      1.3333
["thinking","digital","journey"]              1
["digital","delivery","platform"]             1
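
The weighting in the phrase query is the sum of the NEXT counts along a path divided by the number of words in the path. The following Python sketch reproduces that scoring on a tiny hand-made graph. The function `phrase_weights` and the pair counts are assumptions for illustration; the counts were picked so that the weights come out the same size as those in the table above, and they are not taken from the real data.

```python
def phrase_weights(next_count, max_hops=3):
    """Score every path of 1..max_hops NEXT hops: sum of hop counts / number of words."""
    # Adjacency list: word -> [(following word, NEXT count)]
    following = {}
    for (w1, w2), count in next_count.items():
        following.setdefault(w1, []).append((w2, count))

    results = []

    def walk(path, used, total):
        hops = len(path) - 1
        if hops >= 1:
            results.append((path, total / len(path)))
        if hops == max_hops:
            return
        for nxt, count in following.get(path[-1], []):
            edge = (path[-1], nxt)
            if edge in used:  # a Cypher path never reuses a relationship
                continue
            walk(path + [nxt], used | {edge}, total + count)

    for start in following:
        walk([start], frozenset(), 0)
    return sorted(results, key=lambda r: r[1], reverse=True)

# Invented NEXT counts
pairs = {("leading", "edge"): 3, ("edge", "portfolio"): 1, ("market", "leading"): 1}
ranked = phrase_weights(pairs)
print(ranked[0])  # (['leading', 'edge'], 1.5)
```

Because the total is divided by the number of words rather than the number of hops, a two-word phrase seen three times scores 3/2 = 1.5, while a three-word phrase whose hops were seen three times and once scores 4/3, about 1.33.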

Further key word analysis

Value capture is a key term within DXC right now. What was the context of "value" in the leadership keynote?

[Figure: 2.png]

I think more creation than capture.
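
One way to look at the context of a single word is to list the words that sit directly before and after it on the NEXT relationships. The Python sketch below shows the idea; it is an assumption about how such a view could be built, not the query behind the figure, and both the `context` helper and the pair counts are invented rather than taken from the keynote.

```python
def context(next_count, word):
    """Return the words seen directly before and after `word`, with NEXT counts."""
    before = {w1: c for (w1, w2), c in next_count.items() if w2 == word}
    after = {w2: c for (w1, w2), c in next_count.items() if w1 == word}
    return before, after

# Invented pair counts, for illustration only
pairs = {("create", "value"): 2, ("capture", "value"): 1, ("value", "clients"): 1}
before, after = context(pairs, "value")
print(before)  # {'create': 2, 'capture': 1}
print(after)   # {'clients': 1}
```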
