Hi Friends,

Even as I launch this today (my 80th Birthday), I realize that there is yet so much to say and do. There is just no time to look back, no time to wonder, "Will anyone read these pages?"

With regards,
Hemen Parekh
27 June 2013

Now as I approach my 90th birthday (27 June 2023), I invite you to visit my Digital Avatar (www.hemenparekh.ai) – and continue chatting with me, even when I am no longer here physically

Monday 21 June 2021

BIG DATA - VIKTOR MAYER-SCHONBERGER AND KENNETH CUKIER



====================================================================================

When I read any book, I scribble my comments / notes in the margins


These reflect my views / opinions about what the author is saying – including my disagreement

Often, my comments are in the nature of telling myself :

Hey ! We should try out this idea in our own business ( Head-hunting / Online Recruitment )

Following are my comments re :

BIG DATA -VIKTOR MAYER-SCHONBERGER AND KENNETH CUKIER

                                                                                                          

=====================================================================================


Pg. No. i

·         Viktor.ms@oill.ox.ac.uk

·         kn@cukier.com

·         (FlyOnTime.us)

Q = pg. 117

·         “open source” source-code for site

Pg. No. 3

·         I suppose the same logic / algorithm / statistics can be applied to terrorist attacks, outbreaks of local wars, demonstrations etc.

Pg. No. 4

·         We have a database of more than 2 million job advts, spread over the past 8 years, posted by 50,000+ employers & recruiters.

·         ±2σ (95%) / ±3σ (99%)

·         Thru RSS feeds, we are downloading some 1500+ job advts DAILY, from leading job portals.

·         Purpose of a Job Advt = Appt. of candidate.

Pg. No. 6

·         TextMechanic.com (free) can count the frequency with which each word appears in a text document. So, it can count the “frequency of words” in a job advt. Can it count in 2 million documents, at one go? That would help

·         Just sent mail (thru FB) to Mike
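The word-frequency idea in the note above can be sketched in a few lines. This is only a minimal illustration, assuming the job advts are available as plain-text strings; the function name `word_frequencies` and the sample advts are hypothetical, not taken from TextMechanic or from the actual database.

```python
import re
from collections import Counter

def word_frequencies(documents):
    """Count how often each word appears across all documents."""
    counts = Counter()
    for text in documents:
        # Lower-case the text and keep runs of letters, digits and apostrophes
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return counts

# Hypothetical sample advts, standing in for the real database
sample_advts = [
    "Wanted: Java developer with 5 years experience",
    "Senior Java developer, Mumbai",
    "Sales manager with experience in switchgear",
]
freq = word_frequencies(sample_advts)
print(freq.most_common(5))
```

Because the counter is updated one document at a time, the same loop could in principle run over millions of advts read from a database cursor, without holding all the text in memory at once.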

Pg. No. 11

·         Imagine what we could do, if we succeed in capturing all “search criteria” used by all jobseekers, on all job portals of the world?

·         WHO will advertise WHAT job and WHEN

Pg. No. 16

·         Algorithmic trading

Pg. No. 17

·         Probability

·         Jobs for All

Pg. No. 19

·         If my suggested mobile app “VotesApp” were to be used for voting in all elections (Central + State + Municipal + Panchayat), then we would have a huge amount of data about “Choices” exercised by 850 million voters, several times.

Pg. No. 21

·         Same is true of “Job Surveys” being conducted by job portals / Recruitment Agencies / Labor Dept, once in six months.

Pg. No. 27

·         Our RSS feeds (from 3/4 major job portals) download 1000-2000 job advts every day, automatically, & normalize them into a structured database

Pg. No. 29

·         No. of job advts released (posted on job portals) by any company, in successive quarters, could well “correlate” with increase / decrease in its Revenue.

Pg. No. 30

·         I have listed many in my “Email to Rohini”

·         Who calls whom and for how long & at what time.

Pg. No. 34

·         Even though the structures of the job advts which we are downloading from different job portals differ, we are “Normalizing” all of them before storing in the database.

·         Or just one wall clock in entire flat
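The “Normalizing” step mentioned above can be pictured as a field mapping from each portal's own labels to one common schema. The sketch below is an assumption about how such a mapping might look; the portal names, field names and the `normalize` function are hypothetical and do not describe the actual feeds.

```python
# Hypothetical field names: each portal labels the same information differently
FIELD_MAP = {
    "portal_a": {"title": "designation", "company": "employer", "city": "location"},
    "portal_b": {"job_title": "designation", "org": "employer", "loc": "location"},
}
COMMON_FIELDS = ("designation", "employer", "location")

def normalize(record, source):
    """Rename a portal-specific record's fields to one common schema."""
    mapping = FIELD_MAP[source]
    # Start with every common field empty, so all rows share the same columns
    row = {field: None for field in COMMON_FIELDS}
    for field, value in record.items():
        if field in mapping:
            row[mapping[field]] = value.strip()
    row["source"] = source
    return row

a = normalize({"title": " Sales Manager ", "company": "ABC Ltd", "city": "Mumbai"}, "portal_a")
b = normalize({"job_title": "Sales Manager", "org": "XYZ Ltd"}, "portal_b")
print(a)
print(b)
```

Fields a portal does not supply stay as `None`, so every stored row has the same columns regardless of its source.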

Pg. No. 37

·         If an average job advt in our database of 2 million contains 100 words, then we get 200 MILLION words.

·         Even in 1967, L & T’s IBM 1401, did!

·         I wrote such a logic for the tool Harvester Guru Gem.

Pg. No. 38

·         On my Facebook page, I have posted one such “hilarious” translation of my Hindi poem! Must try out the poem I wrote today – 01 May 2016

·         Often produces lots of laughs.

Pg. No. 39

·         See Google’s “N-gram” project

·         (“Frequency of Usage” – increasing or reducing)

·         As in Neural Network?

Pg. No. 41

·         Policy makers, and Jobseekers too, should be able to see “Job Trends” dynamically, on a day-to-day basis; not after 3 months.

Pg. No. 42

·         The amount of data that can be captured and analysed, if my suggested mobile app “3D-Digital Delivery of Drugs” were to be implemented, is simply mind-boggling. It will change the Health Care Eco-System of our country.

·         Just sent emails to NDA Ministers & the Authors of this book

·         By now this must be 20 billion.

Pg. No. 45

·         When we started getting RSS feeds of job advts, some 10 years ago, I had no idea that we could someday subject them to BIG DATA analysis

Pg. No. 46

·         In any case, some 2 million job advts (database) are on our Web Server, at one place (CustomizeResume.com) + may be another 2 million on IndiaRecruiter.net

Pg. No. 50

·         IndiaRecruiter.net was launched, may be around 2004. In that, we had a feature that told the employers who posted their job advts:-

·         “Here is a list of jobseekers who looked at your job advt, but did not apply. Send them email (click on name) to inquire why?”

Pg. No. 54

·         In the 1970’s, I had recommended to L & T Mgmt to install computers at all Switchgear Dealers, connected to a central server, so that we know at all times what products are selling at what rate, and the Inventory level of each item with each dealer. 20 yrs before Walmart!!

Pg. No. 55

·         Guess

Pg. No. 56

·         With a data set of 2 million job advts, compiled over the past 8 years and covering 50,000 employers, I am sure BIG DATA can “predict” WHO will advertise WHAT vacancy and WHEN

·         Consumer Surplus.

Pg. No. 57

·         If (say) WIPRO, has posted 10,000 job-advts in past 8 yrs., we can tabulate these : DESIGNATION wise for 32 quarters. Frequencies (for each vacancy) will tell us shift in WIPRO’s business or clients

·         Hiring Patterns.
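The designation-wise tabulation by quarter described above could be sketched as follows. This is a rough illustration under stated assumptions: each advt is assumed to carry a `designation` and a `posted` date, and the records shown are invented for the example, not real WIPRO data.

```python
from collections import Counter
from datetime import date

def quarter_of(d):
    """Return a label such as '2015-Q2' for a posting date."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"

def tabulate(advts):
    """Count advts per (designation, quarter) pair."""
    return Counter((a["designation"], quarter_of(a["posted"])) for a in advts)

# Hypothetical records for one employer
advts = [
    {"designation": "Java Developer", "posted": date(2015, 1, 12)},
    {"designation": "Java Developer", "posted": date(2015, 2, 3)},
    {"designation": "Data Analyst", "posted": date(2015, 5, 20)},
    {"designation": "Data Analyst", "posted": date(2015, 6, 1)},
    {"designation": "Data Analyst", "posted": date(2015, 6, 15)},
]
table = tabulate(advts)
for (designation, quarter), count in sorted(table.items()):
    print(designation, quarter, count)
```

Reading one designation's counts across successive quarters is what would reveal a shift in hiring pattern; with 8 years of data there would be 32 quarter columns per designation.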

Pg. No. 58

·         Frequently quoted example.

Pg. No. 59

·         Another frequently quoted example

Pg. No. 60

·         If a good “proxy” for diseases is “Medicines” purchased, then my suggested mobile app, 3D-Digital Delivery of Drugs, could yield predictions.

·         Quiet before the storm?

Pg. No. 68

·         This example too, has been quoted earlier in some other book

Pg. No. 73

·         I have read this story too, in some other book!

Pg. No. 76

·         My database of 2 million job-advt, too is “OLD”!

Pg. No. 78

·         I believe we have.

Pg. No. 83

·         Text resumes could be searched for specific “Keywords” (as in ISYS), but that throws up thousands of records. We need to “parse” the text and create a structured database (datafy), to narrow down to a few records.

Pg. No. 89

·         In the late 60’s, I had designed a system for tracking L & T trucks, which went out for collecting raw materials and components from vendors all over Mumbai. Each driver was given a map of Mumbai (print out) with collection locations marked on it, showing the route to be followed. He had to phone from each vendor

Pg. No. 90

·         Eg: ad of a restaurant on mobiles of persons passing outside that restaurant.

Pg. No. 91

·         Thru (www.hemenparekh.in) I am trying to preserve my sentiments for posterity. It is my DIGITAL AVATAR.

Pg. No. 92

·         (www.Belong.co) uses data from many social sites, to recommend suitable candidates to Employers.

Pg. No. 93

·         It must have computed the “frequency of Usage” of words (in these tweets), which describe these “Emotional States”.

·         In 2 million job advts, we have “date of posting”

·         In our “Resume Blaster” mobile app, we are able to pick up the entire “CONTACT” data from the user’s mobile phone. I have written a note on how this can be used to develop “Parekh Rank”, like “PageRank”

·         (05-05-16) (Anokhee’s Birthday)

Pg. No. 95

·         Just read about the app “SpiroCall” of Shwetak Patel, which can tell you about your Lung Function by asking you to speak into any phone (1-800). Just sent email to Shwetak; got reply: 12/18 months to go.

Pg. No. 96

·         For the past 12 months, I have sent out several emails (to Ministers etc.), suggesting embedding RFID sensors in plastic notes of Rs. 500 / 1000 denominations – to eliminate BLACK MONEY.

Pg. No. 99

·         In trying to analyse 4 million job advts, using BIG DATA, this is what I am targeting. Once that vacancy gets filled up, that job advt’s primary use gets over.

·         Now 15.0 in NY.

·         Job Advts (data) can discover “skills required” by employers, across Industries, Regions, Time periods

Pg. No. 100

·         (www.Belong.co) is trying to use such sources for finding suitable (passive) candidates.

·         How can I use site-visit logs captured by HISTAT & feed JIT for my websites for last 8 yrs?

Pg. No. 101

·         Read covering letter of my report, “QUO VADIS” ( 1987)

·         Where I have talked about “Information” as the most important resource for business.

Pg. No. 106

·         When a company is in “Growth Mode”, it hires lots of people. So, the no. of job advts released by a company (or an Industry) can be extrapolated to predict its future sales.

·         Analysing 4 million job advts over 8 yrs can show how jobs are moving from one region to another, over a period of time. JOB TRAFFIC

Pg. No. 109

·         In B2BmessageBlaster, we are showing balloons on a map of India, to indicate the No. of companies in a given city. Same can be done with “No. of jobs”.

·         We have tried this in B2B

Pg. No. 110

·         What type of persons (designations / vacancy names or skills) a company or an Industry was hiring 8 yrs ago may not help to predict what it will hire in the next quarter. But measuring historical “Shifts” is useful in other ways.

Pg. No. 112

·         In most of our 4 million job advt database, the “Salary” field has no value supplied by the advertiser.

Pg. No. 114

·         “Wisdom of the crowd”

·         Herd mentality. I would buy a Kindle if I could make notes in the margin by writing (as I am doing right now) – instead of having to TYPE.

·         I sent Jeff Bezos this suggestion thru email. No reply.

Pg. No. 115

·         Sanjay Sarma (MIT) is on the board of edX. Read his email to me, in blog “Oracle has spoken”, agreeing with my idea to embed RFID Sensors in Rs. 500/1000 currency notes.

Pg. No. 117

·         About a year back, TRAI put up on its website the email IDs of about a million persons who sent feedback on “Net Neutrality”. When I found out, I got Sanjivani to download these. For the past one week, she has been sending my email to these people at 6000 per day.

Pg. No. 122

·         In B2BmessageBlaster, we

·         Got company (Name / Ind.) data from [Job Advt] [RSS feeds] from leading jobportals (Source)

·         Country-wise investments (into India) data from framing appropriate Google Alerts (Source)

·         So far, we have not been able to monetize, the latent “value” of these data, since we do not have money to “promote / publicize” B2B.

Pg. No. 124

·         In CustomizeResume.com, we get job-advt RSS feeds from just 3/4 major Indian jobportals, which, currently give us, approx. 1000 / 1500 records / day. But suppose we could get 1 lakh per day?

·         Like our Job Advt RSS feeds

·         4 million job advts can help compose a new job advt or a job description

Pg. No. 126

·         I suppose, any time they want, Naukri / Monster / TimesJobs can stop their RSS feeds to us!!

·         If & when re-launched, Global-Recruiter could be one

Pg. No. 127

·         Just imagine, if hundreds of job portals become Partner websites of Global Recruiter and all their “Search queries” (job-search / resume search) were captured in a CENTRAL DATABASE, we could literally have a database of intentions.

·         What a co-relation

·         If B2B were to be FREE, could we capture and use BLAST DATA?

Pg. No. 128

·         I must prepare a list of what / how many types of “ANALYSIS / PREDICTIONS” can be made (by IT / ED), if RFID sensors got embedded in notes.

Pg. No. 130

·         Usually each job advt is for ONE vacancy against which, may be, 50 / 100 persons “APPLY”. This ratio gives the “Probability”.

·         Big job portals have this data for millions of job advts of every type. But they are not bothered to analyse and publish their findings.
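The vacancy-to-applicant ratio in the note above works out as simple arithmetic. The sketch below assumes, purely for illustration, that every applicant is equally likely to be selected; that assumption is not stated in the notes, and the function name is hypothetical.

```python
def selection_probability(vacancies, applicants):
    """Naive chance of one applicant being selected, assuming all are equally likely."""
    if applicants <= 0:
        raise ValueError("applicants must be a positive number")
    return vacancies / applicants

# One vacancy against 50 or 100 applicants, as in the note above
p_50 = selection_probability(1, 50)
p_100 = selection_probability(1, 100)
print(f"{p_50:.0%} with 50 applicants, {p_100:.0%} with 100 applicants")
```

With the portals' real application counts per advt, the same ratio could be averaged by company, position or location instead of using one flat figure.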

Pg. No. 131

·         Today’s paper has an article about Arun Murthy of Hortonworks, a big data consultant, which touched revenue of $ 100 million in 4 yrs!!

Pg. No. 132

·         Read my today’s blog : Currency Maps of India.

·         One year old idea, refined by this book.

·         Can be done by analysing 1 lakh job advts posted DAILY on all Indian Job Portals

Pg. No. 134

·         Bangalore.

·         I have ideas but no skill.

Pg. No. 135

·         Our job advt RSS feeds from jobportals

·         This gives an idea: we supply Job Market Analysis free to those job portals which give us RSS feeds of jobs on their sites

Pg. No. 136

·         Our “Job Market Analysis” will clearly show in which cities / states jobs are growing / reducing.

·         We are using Histat & Feed JIT

Pg. No. 137

·         This is why job advt RSS feeds from hundreds of job portals (which will need to be normalized) will produce powerful predictive analytics.

Pg. No. 138

·         If a job portal provides RSS feeds to jobseekers then the process is automatic, without need for any permission.

Pg. No. 141

·         What are the co-relations between:

·         No. of candidates applying against any advt and

·         Company / Position / Salary / Posting location etc?

·         Avg. “reads” of my poems is 53.8 on the blog site, the Hindi ones being 63 readers

Pg. No. 142

·         Anthony.goldbloom@kaggle.com

·         Anthony@kaggle.com

Pg. No. 144

·         If Naukri / Monster / TimesJobs had analysed the no. of candidates “Applying Online” against each of their millions of job advts, they could easily predict the number that will apply when a new job gets posted.

Pg. No. 148

·         Job-portals are well placed to “pool”, job-search data of millions of jobseekers and thousands of employers (conducting “Resume Search”).

Pg. No. 153

·         In whatever way, we may want to use it?

Pg. No. 160

·         In IndiaRecruiter / CustomizeResume, we are generating “FUNCTION PROFILES”, by finding certain “Keywords” in resumes of candidates.

Pg. No. 164

·         His most famous story: He asked for a report on the “Number” of ----- of collars worn by defense forces and, 6 months later, got a fat report indicating dozens of sizes. Without reading the report, he scribbled 1 ½” for everyone and returned the report for implementation

Pg. No. 167

·         Ha Ha!!

Pg. No. 176

·         See “TENURE PROFILE” graph in IndiaRecruiter.net

Pg. No. 179

·         For each “function” (eg; sales-production-finance), I compiled hundreds of keywords, and assigned “Weightage” to each, based on “Frequency of Occurrence”, to develop a RAW SCORE
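The RAW SCORE idea above can be illustrated with a small sketch. The keywords and weightages below are hypothetical placeholders chosen for the example; the note says the real weightages were derived from frequency of occurrence, which this sketch does not attempt to reproduce.

```python
import re

# Hypothetical keywords and weightages, for illustration only
FUNCTION_KEYWORDS = {
    "sales": {"sales": 3, "customer": 2, "target": 1},
    "finance": {"audit": 3, "budget": 2, "tax": 2},
}

def raw_scores(resume_text):
    """Sum the weightage of every keyword occurrence, separately for each function."""
    words = re.findall(r"[a-z]+", resume_text.lower())
    return {
        function: sum(weights.get(word, 0) for word in words)
        for function, weights in FUNCTION_KEYWORDS.items()
    }

resume = "Handled sales target for key customer accounts; prepared sales budget"
scores = raw_scores(resume)
print(scores)
```

The function with the highest raw score would be the candidate's most likely “function profile”; repeated keywords count each time they occur.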

Pg. No. 180

·         To know how algorithms may, one day, keep City-Footpaths clean (from encroachment), read my blog “1 DOWN : 2 To Go”.

Pg. No. 183

·         Google / MS / Amazon / eBay / YouTube / LinkedIn / Facebook etc.

Pg. No. 189

·         Received this morning, an email from Travis Martin, (Support Mgr. - WATSON Analytics Help Team), saying, IBM will help us execute “Job Market Analytics” project. Great news

·         Having a conf. call tomorrow with Fabian of IBM Singapore re; WATSON

Pg. No. 190

·         Those 2 + million job advts over the past 5 yrs

·         Will “Job Mkt Analytics” help frame Policy?

·         If we succeed, we will get more RSS feeds from many portals

Pg. No. 191

·         Since we did “Field Mapping”, 5 yrs back, jobportals may have added “New” Industries / Functions, or given them “New” Names – which we are not aware of.

Pg. No. 192

·         If displaying “Job Mkt. Analytics” on CustomizeResume builds up huge traffic, we should consider reviving the “My Jobs” app (combined with “RESUME BLASTER”).

Pg. No. 193

·         Also read my blog:- 3D-Digital Delivery of Drugs

Pg. No. 194

·         Last week, scientists published an algorithm to predict “Terrorist Attacks”, based on analysis of 14000 such attacks, over the last 20 years!

·         My suggested mobile app [BANMALI] can help overcome malnutrition in India. See my blog of yesterday.

Pg. No. 195

·         “Job Mkt. Analytics” will tell students what “skills” to acquire, and tell colleges what “skills” to teach / impart.

Pg. No. 201

·         We are getting 1500 new job ads daily

·         Ever-changing Truth

Pg. No. 202

·         This means, if we can find those 2 million [OLD] job-advts from IndiaRecruiter database and then add these up into 2 million [NEW] job-advts of CustomizeResume, our prediction will get better.

Pg. No. 203

·         Our job-advt RSS feeds

Pg. No. 204

·         Read my 12/13 year old note;

·         Horoscope (and other equally fictitious stories)
