anyone attended AIAP course - AI.sg de

GoodBetterBest

Senior Member
Joined
Jan 23, 2019
Messages
2,382
Reaction score
350
You will always have dev, stage and prd.

Generally you don't do anything directly in prod. Everything needs to go through the dev, stg, prd promotion process.

But if hyperparameter tuning is part of your pipeline and you have gone through the dev, stg, prd process, then yes, prd might do the retraining. But you will usually also include a validation step, where you check whether the retrained model should be promoted, or whether you should stick with the current model.

But don't you just promote the parameter changes as part of dev, sit, uat, stg -> prod instead of training in prd? All parameter changes, as I understand it, are promoted. Unless it's system tuning specific to the prd server, e.g. the memory utilisation in prd.
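A minimal sketch of what that validation ("champion vs challenger") step might look like. This is purely illustrative; the metric, threshold and function names are assumptions, not something described in the thread.

# Illustrative promotion gate: keep the current (champion) model unless the
# retrained (challenger) model is measurably better on a held-out set.
from sklearn.metrics import f1_score

def evaluate(model, X_val, y_val):
    # Any agreed-upon metric works; macro F1 is just an assumption here.
    return f1_score(y_val, model.predict(X_val), average="macro")

def should_promote(challenger, champion, X_val, y_val, min_gain=0.01):
    # Promote only if the retrained model beats the current one by a margin.
    return evaluate(challenger, X_val, y_val) >= evaluate(champion, X_val, y_val) + min_gain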
 

xGrendelx

Junior Member
Joined
Jun 27, 2016
Messages
18
Reaction score
13
Of course, but therein lies the moral hazard, isn't it, when you have to "fit" the data? lol

Depends on which school of thought you prefer.

Model-centric:
  • Keep whacking the model until you get the desired results
  • Data-wise, make sure it's clean and acceptable

Data-centric:
  • Data is your core; you make sure it is as clean, unbiased, properly split and representative of real-world conditions as possible
  • Model-wise, you pick one that has been proven and well documented to be able to solve your problem
Since constantly retraining a model is expensive, going data-centric is usually more efficient.

Of course, the data methods in data-centric can and should be applied to model-centric too, but quite often you see people just try to whack a model with whatever data they have and expect a good result straight off the bat.
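As a purely illustrative example of those data-centric checks (the file name, label column and split ratio below are assumptions, not from the thread), a minimal clean-and-split sanity check might look like:

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("data.csv")           # hypothetical dataset
df = df.drop_duplicates().dropna()     # basic cleaning

# Stratified split so train and test keep the same label distribution.
train, test = train_test_split(df, test_size=0.2, stratify=df["label"], random_state=42)

# Quick representativeness check: compare label proportions across the splits.
print(train["label"].value_counts(normalize=True))
print(test["label"].value_counts(normalize=True))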
 

fishbuff

High Supremacy Member
Joined
Jun 20, 2004
Messages
44,300
Reaction score
2,641
reading this now.. thought you might be interested too
[image attachment]
 

eterna2

Arch-Supremacy Member
Joined
Aug 16, 2007
Messages
18,361
Reaction score
232
It's good to at least have a basic understanding of AI/ML and how the models are created.

Even when you are just throwing data into a pre-trained model, you need to know what to look out for, such as:
  • Over/underfitting
  • Adjusting your inputs to match what the pre-trained model expects
  • What actions you can take to tune/train the pre-trained model if it's not performing as required
Actually a lot of data science is about feature engineering, which depends on both business knowledge and modelling knowledge.

Yes, you can use an SDK. But rubbish in, rubbish out.
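As a purely illustrative sketch of "adjusting your inputs to match the pre-trained model" and of light fine-tuning (the choice of torchvision's ResNet-18 and the 2-class head are assumptions, not something mentioned in the thread):

import torch
from torchvision import models, transforms

# Pre-trained vision models expect a specific input size and normalisation.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# If it underperforms, one common option is to freeze the backbone and
# fine-tune only a new final layer (hypothetical 2-class task).
for p in model.parameters():
    p.requires_grad = False
model.fc = torch.nn.Linear(model.fc.in_features, 2)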
 

eterna2

Arch-Supremacy Member
Joined
Aug 16, 2007
Messages
18,361
Reaction score
232
But don't you just promote the parameter changes as part of dev, sit, uat, stg -> prod instead of training in prd? All parameter changes, as I understand it, are promoted. Unless it's system tuning specific to the prd server, e.g. the memory utilisation in prd.
Depends on your use case. But yeah, what you describe is the norm.

Because sometimes the thing you are trying to predict is very dynamic. So it is a bit like automated A/B testing, with feedback that goes back to tune the parameters.
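A toy illustration of that kind of feedback loop, as an epsilon-greedy choice between two model variants. The variant names, the epsilon value and the idea that feedback arrives as a numeric reward are all assumptions for this sketch.

import random

# Route traffic between two variants and drift towards whichever earns
# better observed feedback (clicks, conversions, etc.).
rewards = {"champion": [], "challenger": []}

def pick_variant(epsilon=0.1):
    # Explore occasionally, or while either variant has no feedback yet.
    if random.random() < epsilon or not all(rewards.values()):
        return random.choice(list(rewards))
    # Otherwise exploit the variant with the best average reward so far.
    return max(rewards, key=lambda k: sum(rewards[k]) / len(rewards[k]))

def record_feedback(variant, reward):
    rewards[variant].append(reward)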
 

eterna2

Arch-Supremacy Member
Joined
Aug 16, 2007
Messages
18,361
Reaction score
232
Yea...there's this train of thought that with AutoML, DS/DA will have to start being more involved with other aspects and hence, FULL STACK DS lol
The reality is you need more data engineers and ML engineers than data scientists.

But the market is flooded with data scientists, and orgs have data science teams without a supporting data and ops team.

So it usually ends up failing. You only need 1 data scientist to develop the model, but you need a team of data engineers and ML engineers to make it go live.

Instead, more often than not, it is either the DS self-servicing with crappy code and system design, or 1 poor contracted engineer serving like 4 data scientists.
 

ahboy82

High Supremacy Member
Joined
May 28, 2008
Messages
47,992
Reaction score
1,256
Wa u guys r so pro with productionalizing ml codes. So far moi learnt coursera is machiam quick n dirty analysis leh, nvr teach abt how to put into production leh. Cham la, like that how to find new job

:(
 

dannytan87

Great Supremacy Member
Joined
Oct 3, 2011
Messages
52,245
Reaction score
915
Wa u guys r so pro with productionalizing ml codes. So far moi learnt coursera is machiam quick n dirty analysis leh, nvr teach abt how to put into production leh. Cham la, like that how to find new job

:(
ahboy so sexcited!!!!!!!!
 

jgyy1990

Arch-Supremacy Member
Joined
Apr 28, 2012
Messages
12,398
Reaction score
78
Wa u guys r so pro with productionalizing ml codes. So far moi learnt coursera is machiam quick n dirty analysis leh, nvr teach abt how to put into production leh. Cham la, like that how to find new job

:(
For me, I don't even know much about ML; I just code Python as if it's regular backend development. And I seed my data with lorem ipsum far more than I take data from the internet.
 

theMKR

High Supremacy Member
Joined
Nov 3, 2016
Messages
42,420
Reaction score
2,129
The reality is you need more data engineers and ML engineers than data scientists.

But the market is flooded with data scientists, and orgs have data science teams without a supporting data and ops team.

So it usually ends up failing. You only need 1 data scientist to develop the model, but you need a team of data engineers and ML engineers to make it go live.

Instead, more often than not, it is either the DS self-servicing with crappy code and system design, or 1 poor contracted engineer serving like 4 data scientists.
Isn't that the SG way?

:s13:
 

Cheesebuns

Junior Member
Joined
Aug 26, 2021
Messages
1
Reaction score
0
Cool stuff! Can you let us know what the final stage interview session was like? :D
 

GoodBetterBest

Senior Member
Joined
Jan 23, 2019
Messages
2,382
Reaction score
350
It was over 1.5 hours, divided into 30-minute chunks. You will be interviewed alongside another applicant. In the first hour, either you or the other applicant will be interviewed individually, where you go through your technical test and answer additional technical questions, and then the two of you swap. So it's a 50-50 chance whether you go first or later, but either way you will have a 30-minute break in the waiting room.

In the final 30 minutes you will go through a case study with the other applicant and work together to present a solution, presumably to gauge how well you work in a team.

Honestly, I don't think you can prepare for the interview at all. It seems to be more of a gauge of whether you submitted the technical test without getting help from others, since these days it's not hard to just outsource it to someone on Fiverr.

From my understanding, they appear to change the interview format each round, so take this with a grain of salt.

It's not easy to test teamwork with somebody just like this. Quite challenging.
Was the presentation about the approach to solving the problem, or the entire solution? How many interviewers were there? Do they focus on the technical solution only, or also on AI ethics, cost of implementation, feasibility, explainability, etc.?

Thanks very much!
 

aceminer

High Supremacy Member
Joined
Oct 16, 2007
Messages
43,530
Reaction score
2,618
I will not address those questions directly because I don't think that's really the point of the interview. I don't think the interviewers have a set list of questions to ask. Much of it was really fluid and about trying to understand you better as a person and how you would fit in with the team.

Honestly, a non-technical salesperson with zero ML knowledge could just as easily have completed the group exercise, since they were really testing your performance in a group-dynamics setting. Such interviews are commonplace in consulting and many other roles where group work matters. In fact, I believe SIA does something similar when picking flight attendants, to see how you work with people.

There really is nothing to "study" for the interview. If you could pass the technical test, the interview is really just to ascertain that the submission was indeed your work and not outsourced or completed by someone else.

In fact, I believe Laurence actually addressed this directly in the webinar: it's really more of an attitude test than an aptitude test. They have rejected people who did well in their technical tests but appeared arrogant or had a poor attitude during the interview.
All the best
 

GoodBetterBest

Senior Member
Joined
Jan 23, 2019
Messages
2,382
Reaction score
350
I understand that you need to deploy to a Linux machine for the technical assessment. For the program, do you use Linux or Windows for development?
 

Eririn

Junior Member
Joined
Aug 24, 2021
Messages
12
Reaction score
9
I understand that you need to deploy to a Linux machine for the technical assessment. For the program, do you use Linux or Windows for development?
Not sure about the program, but I was able to complete the technical assessment without a Linux machine. I was using Windows and wrote the bash script with Git Bash. In fact, if you look at the laptop requirements on their site, they don't restrict you to Linux only.
 

GoodBetterBest

Senior Member
Joined
Jan 23, 2019
Messages
2,382
Reaction score
350
Not sure about the program, but I was able to complete the technical assessment without a Linux machine. I was using Windows and wrote the bash script with Git Bash. In fact, if you look at the laptop requirements on their site, they don't restrict you to Linux only.

Thanks a lot. I guess it also depends on the sponsor and what platform they use.
You know, since the field guide mentions Docker, cloud, etc., I thought I needed to package the solution in Docker and deploy it on the cloud. :)

There are 2 parts to the assessment, right? EDA and ML/DL? Did you use machine learning or deep learning for it, if you don't mind telling? Thanks!!!
 

aspenco

Arch-Supremacy Member
Joined
Oct 12, 2019
Messages
17,077
Reaction score
2,565
Any anaconda experts know how to remove all similar rows between 2 data frames, and keep it in a new data frame? :(
 

GoodBetterBest

Senior Member
Joined
Jan 23, 2019
Messages
2,382
Reaction score
350
Any anaconda experts know how to remove all similar rows between 2 data frames, and keep it in a new data frame? :(

Not an expert....

Assuming your original df1, df2 do not have duplicates in the first place... there may be better ways to do this... :)

import pandas as pd

# Example frames: df2 shares three rows with df1 plus one unique row.
df1 = pd.DataFrame(
    {
        "col1": ["a", "b", "c", "d", "e"],
        "col2": [1.0, 2.0, 3.0, 4.0, 5.0],
        "col3": [1.0, 2.0, 3.0, 4.0, 5.0],
    },
    columns=["col1", "col2", "col3"],
)
print('\ndf1 = \n', df1)

df2 = pd.concat([df1.iloc[[0, 2, 4], :],
                 pd.DataFrame({'col1': 'z', 'col2': 10.0, 'col3': 20.0}, index=[0])])
df2.index = [7, 8, 9, 10]
print('\ndf2 = \n', df2)

# df3 keeps the rows common to both frames (the second copy of each duplicate).
df3 = pd.concat([df1, df2])
df3 = df3[df3.duplicated()].reset_index(drop=True)
print('\ndf3 = \n', df3)

# Drop the common rows from df1 (keep=False removes every copy of a duplicate).
df1 = pd.concat([df1, df3])
df1 = df1[~df1.duplicated(keep=False)]
print('\ndf1 = \n', df1)

# Same for df2.
df2 = pd.concat([df2, df3])
df2 = df2[~df2.duplicated(keep=False)]
print('\ndf2 = \n', df2)


Output:

df1 =
col1 col2 col3
0 a 1.0 1.0
1 b 2.0 2.0
2 c 3.0 3.0
3 d 4.0 4.0
4 e 5.0 5.0

df2 =
col1 col2 col3
7 a 1.0 1.0
8 c 3.0 3.0
9 e 5.0 5.0
10 z 10.0 20.0

df3 =
col1 col2 col3
0 a 1.0 1.0
1 c 3.0 3.0
2 e 5.0 5.0

df1 =
col1 col2 col3
1 b 2.0 2.0
3 d 4.0 4.0

df2 =
col1 col2 col3
10 z 10.0 20.0
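A more direct alternative, sketched here under the same assumption that neither frame has internal duplicates, is pandas' merge with indicator=True, applied to the original df1 and df2 (i.e. before the removals above):

# Rows common to both frames, kept in a new frame:
common = df1.merge(df2, how="inner")

# Rows unique to each frame (anti-join via the _merge indicator column):
only_df1 = df1.merge(df2, how="left", indicator=True).query('_merge == "left_only"').drop(columns="_merge")
only_df2 = df2.merge(df1, how="left", indicator=True).query('_merge == "left_only"').drop(columns="_merge")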
 