Hi Friends,

Even as I launch this today ( my 80th Birthday ), I realize that there is yet so much to say and do. There is just no time to look back, no time to wonder, "Will anyone read these pages?"

With regards,
Hemen Parekh
27 June 2013

Now as I approach my 90th birthday ( 27 June 2023 ) , I invite you to visit my Digital Avatar ( www.hemenparekh.ai ) – and continue chatting with me , even when I am no more here physically

Monday, 26 July 2021

BIG DATA - JOHN MURRAY

 


====================================================================================

When I read any book, I scribble my comments / notes in the margins


These reflect my views / opinions about what the author is saying – including my disagreement

Often, my comments are in the nature of telling myself :

Hey ! We should try out this idea in our own business ( Head-hunting / Online Recruitment )

Following are my comments re :

==================================================================================

 

Page no. 2

·         Highest frequency.

Page no. 4

·         If we succeed in analyzing over 6 million job advts compiled over the last 6 years and plot “trends,” we too may be able to do some predictions like this.

Page no. 6

·         By analyzing 6 million job advts, we may be able to “predict” who (which company/industry) will need whom (position/designation/title) and when (time). That can change online job-portals!!

Page no. 7

·         We may enable perfectly suited candidates (“Match India” based on raw scores) to apply to the most needy companies at just the right time! No need for job search!!

Page no. 8

·         Also the Basis of our “jobs Recommendation system”

Page no. 12

·         Our “Archive jobs” database grows by ~1000 per day

·         Jobs with Resumes

·         Can we “monetize” 6 million job advt databases?

Page no. 13

·         Getting RSS feeds from just 4 job-portals is not good enough to “predict” jobs in Indian Economy. We need many more RSS feeds.

Page no. 14

·         No “effect” without a corresponding “cause.”

Page no. 16

·         For 15 minutes (on Google), I could not find a single website listing all “professional Associations/Bodies” of India on one page! Maybe I will end up paying Rs.5000/= to some database vendor who spent months compiling such a database. If we somehow compile such a list, we could possibly “monetize” it.

Page no. 18

·         Compared to the few, very precise/correct and “verified” entries of “Encyclopaedia Britannica,” Wikipedia has millions of subject articles of lesser exactitude.

Page no. 20

·         I literally got frustrated while searching Google for “professional Associations of India” and gave up. Imagine trying to compile a database of prof. Asso. of the entire world!

Page no. 22

·         I used these (to type out my name) on an IBM 1401 computer, in 1957, at Univ. of Kansas. At L&T, Powai Works, we were still using punch-cards in the 1960s & 70s.

·         My MS thesis was “Use of Ratio Delay method on clerical activities.” The activities themselves were “random,” but I made observations of a different clerical worker every minute.

Page no. 23

·         95% (i.e. ±2σ)

·         ±3σ would give you a 99.7% confidence level

·         SQC's UCL & LCL of process variations.
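
A minimal sketch of the control-limit idea noted above, applied to daily job-advt counts. It assumes limits are set at the mean plus or minus 3 standard deviations of a baseline period; the function names and the sample counts are hypothetical, not taken from our database.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (LCL, centre line, UCL) as mean -/+ k sample standard deviations."""
    centre = mean(baseline)
    sigma = stdev(baseline)
    return centre - k * sigma, centre, centre + k * sigma


def out_of_control(baseline, new_counts, k=3.0):
    """Return (index, count) pairs of new observations outside the limits."""
    lcl, _, ucl = control_limits(baseline, k)
    return [(i, c) for i, c in enumerate(new_counts) if c < lcl or c > ucl]


# Hypothetical daily counts of job advts added to the archive.
baseline_counts = [1000, 980, 1020, 990, 1010, 1005, 995, 1000, 1015, 985, 1000]
recent_counts = [1010, 1200, 950]

flagged = out_of_control(baseline_counts, recent_counts)
print(flagged)
```

Days that fall outside the limits are the ones worth a second look, for example a sudden burst of job advts from one industry.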

Page no. 27

·         We just keep adding approx. 1000 job-advts to our “Jobs Archive” database, EVERY DAY! But that no. could still be only a “sample” of the total population of job-advts getting published/expired daily!!

·         We need to very or develop on our own

Page no. 27

·         We have 6+ yrs. of job-advts

·         Points lying outside ±3σ?

Page no. 28

·         Do companies have “lean” (few) and “fat” (many) periods of job-advts? When do these occur? Do these “coincide” with some other thing such as 1. Year 2. Campus interview time 3. Announcements of Annual Results?

Page no. 30

·         Some headhunting/recruitment firms and some job-portals routinely conduct annual “surveys” of hiring trends/job outlook etc., based on questionnaires circulated to 500/1000 companies. That is a small sample from over 50,000 who advertise their jobs on job portals. If we succeed in analyzing 6 million job-advts over the last 6 years, we can predict WITHOUT sampling.

Page no. 34

·         From database vendor Amit (Ahmedabad), we purchased a database of jobseekers consisting of 11.5 million records (each having 26 fields). After removing (cleaning?) NULL value fields, we got only 1.6 million. These are waiting to be uploaded on our website.

·         At least OFFLINE (in a “project” setting) we should upload ALL 11.5 million records and then conduct our traditional “RESUME SEARCH” to see what kind of RESULTS we get!

Page no. 35

·         Since we know that a typical RESUME SEARCH does not entail use of all 26 fields, it is very likely that we will

Page no. 38

·         About 15 days back I used it to translate one of my poems and it was not too bad!

Page no. 39

·         I have written notes, using this concept to “alter” the “weightages” of keywords (to compute raw scores), based on most recent “frequency of usage” in job advts & resumes. I wrote this 5 yrs. back!

·         We used this logic in developing our Resume parsing algorithm in Recruit Guru.com

Page no. 43

·         On CustomizeResume, there are ~20,000 LIVE jobs & 1.2 million “ARCHIVAL” jobs. For searching both of these (small) databases, we have got them “structured” and search with specific “search parameters.”

Page no. 46

·         E.g.: we have approx. 3 million job-advts in the IndiaRecruiter.net database, AND 2.5 million in CustomizeResume.com

Page no. 47

·         Shuklendu – Recall our meeting in your office on 3rd Oct., when I pointed out that the database of 11.6 million candidates that we recently purchased has 26 fields, but quite a few records have 1 or 2 or 3 fields wrong or blank. We must try to use these for B2C.

Page no. 48

·         Neither CustomizeResume, nor B2B, nor B2C needs a high level of accuracy of “search results,” i.e. “mailing lists” which are perfect.

Page no. 50

·         Like our planned “job Recommendation System”

Page no. 52

·         Some day we must use this logic for B2B & B2C as well. All we need to do is to capture/store all data, such as:

·         Messages

·         Mailing lists

·         Cities /countries

·         Industries

·         Time data

·         Recipient Responses

·         Senders of messages

Page no. 62

·         Shuklendu – our RESUME BLASTER app picks up, and stores, the entire CONTACT LIST from the user's mobile.

Page no. 63

·         Six Degrees of Separation?

·         See the serial “Through the Wormhole with Morgan Freeman.” Scientists are coming around to this view!

Page no. 65

·         My little sister Uma was one, and I still feel responsible for her death, some 50 years ago.

Page no. 76

·         Can we find a way to monetize our archive of 6 million job advts?

Page no. 78

·         I believe in our library

Page no. 84

·         N Gram project of Google covers 100 million books!

Page no. 90

·         All over the world, commuting to work in big cities is getting to be a nightmare. More & more people wish to work within walking distance of their homes, so job-alerts for jobs on their streets could get interesting.

Page no. 91

·         Today's newspaper talks of an algorithm which “studies” tweets of people who “tweet” hundreds of times every week. From such analysis, S/W plots “past events” in that person's life!!

Page no. 92

·         See my handwritten note re: predictions of “when” an executive is most likely to change his job, based on analysis of “Tenure/periods” of thousands of SIMILAR execs

·         IF you are a Google+ person and click on the “Like” + icon on any product/service, then Google will pop up your photo/name next to its AdWords/AdSense ads on millions of websites (without your permission!). This is reported in today's papers.

Page no. 94

·         Our “Resume Blaster” app also picks up “CONTACT LISTS” from mobiles of users. So we know who is connected to whom

·         Next: who called up whom and when

·         In past 6 months, I fell unconscious in bathroom twice!

Page no. 101

·         Read my 1987 report, titled “Quo Vadis” re: future growth of L&T.

Page no. 104

·         On our “job suggest” page we are compiling the frequency with which a jobseeker selects an “INDUSTRY,” then rearranging the drop list for him

Page no. 106

·         We have accumulated approx. 5 million job advts over the past 5/6 years, date-wise. If we find a peak during a certain period (for any company or industry), will we find a peak in its “sales” etc. in the following period?

·         3P is offline and has accumulated 1.3 terabytes of data since 1999. Someday we should use this to predict which candidates are most likely to get selected by our clients, for each incoming mandate.

Page no. 110

·         Most recent “history” should get higher weightage
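
A rough sketch of what “higher weightage for recent history” could look like for keyword frequencies. It assumes an exponential decay with a one-year half-life, which is only one of many possible weighting schemes; the function name, the half-life and the sample data are hypothetical.

```python
def weighted_keyword_frequency(occurrences, half_life_days=365.0):
    """Sum recency-weighted counts per keyword.

    occurrences: iterable of (keyword, age_in_days) pairs, one per sighting
    of the keyword in a job advt or resume. A sighting that is
    `half_life_days` old counts half as much as one seen today.
    """
    totals = {}
    for keyword, age_in_days in occurrences:
        weight = 0.5 ** (age_in_days / half_life_days)
        totals[keyword] = totals.get(keyword, 0.0) + weight
    return totals


# Hypothetical sightings: (keyword, how many days ago it appeared).
sightings = [("python", 0), ("python", 365), ("cobol", 730), ("cobol", 730)]
weights = weighted_keyword_frequency(sightings)
print(weights)
```

Both keywords appear twice, yet the one seen recently ends up with three times the weight of the one last seen two years ago.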

Page no. 114

·         What I am doing on this hard copy! Maybe one day hard copies will have chips embedded that will transmit this data to Author/Publisher/seller etc.!!

Page no. 121

·         Since 1999, we have accumulated 1.3 terabytes of data. How can we “Re-use” it to our advantage? What “co-relations” can we establish between “Resumes” and “Successful” candidates?

·         By “successful” I mean, those candidates who finally got appointed with our clients, after several “Elimination” steps/processes. In past 14 years, maybe we managed to appoint 1000 executives, having initially shortlisted 50,000 resumes. What “resumes” got rejected at which stage and why?

Page no. 123

·         Like our RSS feeds for job-advts from 4 job portals, thru which we have accumulated 5+ million job ads.
How can we use this to “predict” who will advertise what job, and when?

Page no. 124

·         Job portals have huge data in the form of

·         Job Advts &

·         Resumes.

Page no. 126

·         “What business are you in?” Theodore Levitt in “Marketing Myopia”

Page no. 128

·         What “Keywords” found in a person's “Resume,” co-relate with

-          Designation Level

-          Salary

-          Tenure

-          “Success” at job interviews?

Page no. 129

·         We have 5 million job-advts collected over last 6 years.

Page no. 130

·         Would jobseekers be interested to know which industry/company/region would need many employees, and when?

·         And which companies are most likely to be hiring freshers? And at which period of the year? I am sure such “predictions” would be of interest to job seekers. Using our database of 5 million job-advts, we can make such predictions.

Page no. 131

·         If they click, B2B and B2C will have a lot of data accumulation, by way of

·         Who (senders of mktg. messages & Recipients)

·         When

·         Why

·         Where (Countries/cities)

·         What

·         We can capture this data and re-use it over time

Page no. 132

·         Only yesterday I suggested to Nitrn (sentient) that we use Google Maps: when our site pops up the list (names of recipients with their cities), the user could click a link which will pop up a Google Map, showing where exactly that recipient is located.

·         In B2B, marketers would love to know who read their message, how many read it, and who & how many responded. We plan to compile this feedback and make it available to the marketers

Page no. 133

·         By comparing/tabulating “% age Response” to all the marketing messages sent out by all the subscribers, we would come to know WHAT kind of messages generate maximum response, by arranging them in descending order of “% Response.” As soon as a marketer composes a message, we might even be able to “predict” what % age of recipients will respond

·         Such an ability to predict and advise marketer to modify his message could be a big competitive advantage!
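
The tabulation described above can be sketched as follows. This is only an illustration: the `Campaign` record, its field names and the numbers are hypothetical, and it covers the “arrange in descending order of % response” step, not the prediction itself.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """One marketing message and its outcome (hypothetical record layout)."""
    message_id: str
    sent: int
    responses: int

    @property
    def response_pct(self) -> float:
        # Guard against division by zero for messages not yet sent.
        return 100.0 * self.responses / self.sent if self.sent else 0.0


def rank_by_response(campaigns):
    """Return campaigns sorted by % response, highest first."""
    return sorted(campaigns, key=lambda c: c.response_pct, reverse=True)


campaigns = [
    Campaign("A", sent=1000, responses=20),
    Campaign("B", sent=500, responses=25),
    Campaign("C", sent=2000, responses=10),
]

for c in rank_by_response(campaigns):
    print(c.message_id, c.response_pct)
```

Once enough messages are tabulated this way, the top of the list shows which kinds of messages to imitate.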

Page no. 135

·         We are “aggregating” job-advts thru RSS feeds. We have “purchased” jobseekers & employers databases. We are “compiling,” from many websites, databases of

·         Chambers of commerce

·         Trade Asso

·         Professional Asso. etc.

·         We are data intermediaries

·         Like in our own “Resume Blaster” app.

·         We need many more job RSS feeds from which we can extract corpo.

-          Name

-          City

-          Industry

-          Email  

·         To plug in to B2B.

Page no. 136

·         On B2B (in course of time), we may have, as “subscribers,” say, auto companies from several countries, each a member of its respective country's AUTO IND TRADE ASSO. Maybe this data will motivate the Trade Asso. themselves to subscribe!

·         On our blogging site, we have counters from

·         Histats &

·         Feedjit

·         B2B will enable us to compile data about WHO is advertising WHAT (products/services), to WHOM (recipients), and WHEN, and getting HOW MUCH response. This analysis will be extremely useful to the online/offline ADVT industry

Page no. 137

·         We will show (enable access) to our various predictive analysis, only to those organizations, which become our “subscribers.” And we will stipulate separate “per view” fee for viewing analytics-as opposed to sending out of marketing messages.

Page no. 144

·         It is all about Prediction

Page no. 145

·         Once B2B is one year old (assuming we have plenty of marketing messages emailed by then to lakhs of recipient companies), we can “predict” whether a just-typed marketing message will get a response from recipients. If not, s/w will advise “changes”

Page no. 147

·         Thru RSS feeds from just 4 job-portals, our “Job Archive” database is growing at approx. 1000 job-advts/day, without incurring any expenses! Imagine how much it would grow if we manage RSS feeds from 40 job-portals!

Page no. 153

·         “Terms and conditions” on all websites

Page no. 156

·         Shuklendu

·         Only yesterday, France & UK were up in arms against NSA, for “spying” on their citizens!

Page no. 158

·         By analyzing resumes of thousands of executives and co-relating their “tenures” with their “age,” can we predict which executive is most likely (±3σ) to quit his job? And when? I think we can.
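
One very simple way to start on this question, sketched under loud assumptions: it groups completed tenures by the executive's age band at joining and treats the band's mean tenure as “typical.” The band width, function names and sample history are all hypothetical, and a real analysis would need far more data and a proper statistical model.

```python
from collections import defaultdict
from statistics import mean


def age_band(age, width=10):
    """Label an age with its band, e.g. 27 -> '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def mean_tenure_by_band(history):
    """history: iterable of (age_at_joining, completed_tenure_years)."""
    groups = defaultdict(list)
    for age, tenure in history:
        groups[age_band(age)].append(tenure)
    return {band: mean(tenures) for band, tenures in groups.items()}


def expected_years_left(age_at_joining, years_in_job, band_means):
    """Typical tenure for the band minus time already served (never negative)."""
    typical = band_means.get(age_band(age_at_joining))
    if typical is None:
        return None  # no comparable executives in the history
    return max(0.0, typical - years_in_job)


# Hypothetical past tenures extracted from resumes.
past = [(28, 2), (25, 3), (35, 5), (38, 7)]
band_means = mean_tenure_by_band(past)
print(band_means)
print(expected_years_left(27, 2, band_means))
```

An executive whose expected years left is at or near zero has already stayed as long as similar execs typically do, which is the group one would watch first.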

Page no. 159

·         In much the same way that a company wants to prevent resignations of competent executives

Page no. 161

·         Exactly the same opinion was expressed in a recent episode of the TV serial “Through the Wormhole with Morgan Freeman”

·         E.g.: probability that he would “resign”

Page no. 167

·         How do we measure “job performance” of executives that 3P has got appointed with clients? Not possible. But we may have enough past data to co-relate “successful” candidates (those that got appointed) with

·         Edu. Quali.

·         Yrs. of exp.

·         Current desig.

·         Salary earned

·         Past “rate of rise” in hierarchy, etc.

·         Companies seem to be hiring “Resumes”!

Page no. 168

·         I held a similar belief while launching IndiaRecruiter.net and, in late 2010, CustomizeResume.com. The first was closed down and the 2nd is still struggling!

 

 

 
