Today I started practicing coding. There are 1,000 questions there and 365 days in a year, so I shall solve at least 3 questions per day, every week. Let the fun begin.
— Update 14 March 2019 —
5 Easy, 10 Medium, 1 Hard
I have always been interested in stock trading. It allows one to participate in a business without actually needing to start one. I did not have a formal education in this area, but to me it comes down to finding a balance between market confidence towards the company, the company's profitability, gauging the market's response to news, and understanding patterns of speculation. I managed to win a Stock Trading competition straight out of college without any prior experience in the matter. Following are some of my thoughts.
to be continued…
Went to the AWS Summit 2018, Singapore yesterday. Regretfully, the pictures taken were meant for reporting purposes only; I should have thought of capturing more of the entire experience. Anyways…
The speaker shared how companies big and small have used AWS services to undergo digital transformation. FINRA is one of their biggest customers; it started its digital transformation 5 years ago and now processes 500 billion transactions daily.
AWS Snowball is a new device for data transportation, while AWS Snowball Edge adds onboard computation capability. For example, Oregon State University uses it for maritime exploration: they can run AWS Lambda onsite and offline.
AWS has 360-degree encryption capability, which ensures the security of data.
AWS SageMaker builds, trains and deploys Machine Learning models; AWS Lex is a speech-to-text and Natural Language Processing (NLP) service; meanwhile AWS Polly is text-to-speech: single message, multiple languages.
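As a rough illustration of the Polly side (not something shown at the summit), a text-to-speech call through boto3 might look like the sketch below; the voice ID and output filename are placeholders.
[python]
# Hedged sketch: synthesizing speech with Amazon Polly via boto3.
# VoiceId and the output filename are placeholders for illustration.
import boto3

polly = boto3.client("polly")

response = polly.synthesize_speech(
    Text="Hello from the AWS Summit!",
    OutputFormat="mp3",
    VoiceId="Joanna",  # swap the voice for other languages
)

# AudioStream is a streaming body; write it out as an mp3 file.
with open("speech.mp3", "wb") as f:
    f.write(response["AudioStream"].read())
[/python]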
AWS Glue is serverless; it explores data and builds a catalog on it, connecting data and analytical services. AWS Athena allows querying data on AWS S3 directly without the need to run a server.
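To make the Athena point concrete, here is a minimal sketch of querying data sitting in S3 via boto3; the database, table and output bucket names are made up for illustration.
[python]
# Hedged sketch: running a SQL query against files in S3 with Amazon Athena.
# Database, table and bucket names below are hypothetical.
import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="SELECT page, COUNT(*) AS hits FROM access_logs GROUP BY page",
    QueryExecutionContext={"Database": "weblogs"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)

# Athena runs asynchronously; poll get_query_execution / get_query_results
# with this id to fetch the result set once the query succeeds.
print(response["QueryExecutionId"])
[/python]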
AWS also offers elastic GPUs. AWS EKS is a managed Kubernetes service that utilizes AWS's container hosting technology. AWS Fargate is container hosting without managing servers or clusters; it underpins both AWS EKS and its Docker-native counterpart, AWS ECS.
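A hedged sketch of what "containers without servers" looks like in practice: launching a task on Fargate through the ECS API. The cluster, task definition and subnet identifiers are placeholders, not anything demonstrated at the summit.
[python]
# Hedged sketch: running a container on AWS Fargate via the ECS API.
# Cluster, task definition and network identifiers are placeholders.
import boto3

ecs = boto3.client("ecs")

response = ecs.run_task(
    cluster="demo-cluster",
    launchType="FARGATE",        # no EC2 instances to manage
    taskDefinition="my-task:1",  # a registered task definition
    count=1,
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-12345678"],
            "assignPublicIp": "ENABLED",
        }
    },
)
print(response["tasks"][0]["taskArn"])
[/python]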
FinAccel allows retail customers to purchase on credit based on their credit ratings. Their app revolves around 2 real-time engines:
i) Real-time credit risk assessment,
ii) Real-time transactions.
3 steps to using their app:
i) Download,
ii) Connect social media, bank, e-commerce and other accounts, then get approved,
iii) Use the app like a credit card.
Users can pay back within 30 days without any charge, or spread payment over an installment plan of up to 12 months. They are trying to solve the lack of access to retail credit in Indonesia: only about 10% of the middle class, around 7 to 8 million people in actual figures, have access to any unsecured credit product from a banking institution, hence the rest turn to expensive consumer finance companies that charge an arm and a leg for offline purchases.
The massive drivers for retail credit are
i) a young population, with 50% of the total being younger than 30 years old,
ii) high smartphone penetration, with around 110 to 120 million Indonesians carrying a smartphone.
Indonesia is a perfect storm: a very large, mobile-first young population that is denied access to credit.
2 major challenges when they first started up:
i) Scalable processing of unstructured data,
ii) High frequency, low latency transaction.
A fraud engine was also deployed to detect fraud in real time.
4 business layers:
i) Application layer: OCR and credit risk approval platform,
ii) Transaction engine: allows users to buy at e-commerce stores,
iii) Data layer: ETL and real-time data processing at 3 different points: 1) apply, 2) buy, 3) collect money from the user,
iv) Integration layer: connects to endpoints on both sides, users and vendors. At the technical level, they have completely moved from EC2 instances to the AWS serverless ecosystem, in a microservices manner.
3 demos in 30 minutes.
i) Serverless Map Reduce
If we look at our usage of clusters, they are usually only active during working hours, so there is a lot of wastage for the remaining 16 hours. MapReduce is a resource-heavy application, so a massive cluster would be required should we decide to use servers (a rough sketch of the serverless take on this follows after the demo list).
ii) Selfie Challenge
The challenge is about people taking selfies that are the most relevant to various emotion categories.
iii) Peta Bencana
Crowd-sourced information dissemination regarding natural disasters.
Those demos were all about serverless, and the message they were trying to share was probably that serverless is especially useful when usage is sporadic or intermittent, like campaigns or events that see a surge.
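For the serverless MapReduce demo mentioned above, I did not capture the actual code, but the idea could be sketched roughly as below: mapper and reducer written as plain Lambda handlers, with the fan-out/fan-in assumed to be orchestrated elsewhere (e.g. S3 events or Step Functions). Function and field names are illustrative only.
[python]
# Rough sketch of the serverless map-reduce idea (not the speaker's code):
# each mapper Lambda counts words in one chunk of text, and a reducer Lambda
# merges the partial counts. Orchestration is assumed to happen outside.
from collections import Counter

def map_handler(event, context):
    # event["chunk"] is assumed to hold one slice of the input text
    counts = Counter(event["chunk"].split())
    return dict(counts)

def reduce_handler(event, context):
    # event["partials"] is assumed to be a list of mapper outputs
    total = Counter()
    for partial in event["partials"]:
        total.update(partial)
    return dict(total)
[/python]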
5 pillars of modern day architecture:
i) Operational Excellence
360-degree monitoring, automation, learning from experience.
ii) Security
At all levels. Trace everything. Automate responses to security events. Secure the system at the application, data and OS levels. Automate security best practices.
iii) Performance Efficiency
Use up-to-date technologies. Deploy systems globally for lower latency. Use services rather than servers. Try various configurations for optimal performance. Innovate at a faster pace.
iv) Reliability
Test recovery procedures, automate recovery.
v) Cost Optimization
Use managed services, do not invest in data centers, and adopt the pay-as-you-go model of the cloud.
Alex Poe was a lawyer but is now a lead solutions engineer. DataDog is a SaaS-based monitoring and analytics infrastructure. He explained the Single Responsibility Principle, which was demonstrated through AWS Lambda (FaaS). AWS Lambda can then be orchestrated using AWS Step Functions, which acts like a scheduler such as Azkaban or Airflow. Step Functions workflows can be built using the Serverless Framework, which is compatible with various cloud providers.
Long story short, there are 3 different serverless architectures: i) synchronous (serial), ii) asynchronous (parallel) and iii) stream-based. AWS API Gateway serves as a front for AWS Step Functions.
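To picture the serial versus parallel flavours, a Step Functions state machine definition (Amazon States Language) that runs one Lambda synchronously and then two in parallel could look roughly like this. This is my own hedged sketch, not anything from the talk; the Lambda ARNs, names and role are placeholders.
[python]
# Hedged sketch: creating a Step Functions state machine that runs one
# Lambda serially, then fans out to two Lambdas in parallel.
# All ARNs and names below are placeholders.
import json
import boto3

definition = {
    "StartAt": "Validate",
    "States": {
        "Validate": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:ap-southeast-1:123456789012:function:validate",
            "Next": "FanOut",
        },
        "FanOut": {
            "Type": "Parallel",
            "End": True,
            "Branches": [
                {"StartAt": "Enrich", "States": {"Enrich": {
                    "Type": "Task",
                    "Resource": "arn:aws:lambda:ap-southeast-1:123456789012:function:enrich",
                    "End": True}}},
                {"StartAt": "Notify", "States": {"Notify": {
                    "Type": "Task",
                    "Resource": "arn:aws:lambda:ap-southeast-1:123456789012:function:notify",
                    "End": True}}},
            ],
        },
    },
}

sfn = boto3.client("stepfunctions")
sfn.create_state_machine(
    name="demo-serial-then-parallel",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsRole",  # placeholder role
)
[/python]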
AWS CodeStar helps with CI/CD, which I am not knowledgeable enough to comment on. Basically, it was about automating serverless deployments using a GitHub and AWS CodeStar integration.
AWS IoT and AWS Greengrass: AWS Greengrass is essentially intranet IoT. Rotimatic has a very cool IoT use case. Rotimatic basically makes roti automatically: users put in flour, water and spices, and the machine makes the roti. What is innovative is that recipes, in machine-setting form, can be uploaded to and downloaded from the Rotimatic, e.g. specifying heat, time and amounts of ingredients, so users can then have a new roti to try. The machine also allows users to give feedback on the roti made. Predictive maintenance is another outstanding feature.
AWS Elastic Container Registry is a Docker registry (repository) on AWS, and AWS Elastic Container Service runs Docker images serverlessly. AWS CodeCommit is like Bitbucket or GitHub, AWS CodeBuild is like Jenkins, and AWS CodePipeline covers the process of unit, integration, system and acceptance testing, etc. (refer to CI/CD). AWS Elastic Container Service for Kubernetes (EKS) is just like AWS ECS but the Kubernetes version. AWS Fargate underpins both.
There are a lot of highly powered brains out there turning their ideas for data products into reality through Spark. Oftentimes, the default packages or functionalities cannot satisfy their needs. We can learn more about what others have been doing with Spark here.
So I was trying to classify paragraphs into their respective groups, with the characters all swapped up and spaces removed. At first I thought of using a combination of MCMC word decryption and space inference as a preprocessing step, so that I could naturally use an RNN to perform classification; however, to no avail, it kept getting trapped in local minima. I then searched for a better solution and stumbled upon the Character-level Convolutional Neural Network for Text Classification. The network was designed as follows:
This dataset consists of 26 characters only, i.e. no information on spacing, periods, the start of a sentence and so on. That is probably the reason why the validation error did not decrease as much. However, with a 30% validation split, the model managed to achieve an accuracy of 70%.
[python]
# Assumed imports for this snippet. Encoder is a custom module defined
# elsewhere in the project (presumably a character embedding:
# 27 symbols -> 386 channels).
import chainer
import chainer.functions as F
import chainer.links as L
import chainer.initializers as I


class CharCNN(chainer.Chain):
    def __init__(self, seq_length, out_size, dropout=0.2, usegpu=True):
        super(CharCNN, self).__init__()
        with self.init_scope():
            self.encoder = Encoder(27, 386, dropout)
            # six temporal convolutions over the character embeddings
            self.conv1 = L.Convolution2D(
                386, 386, ksize=(7, 1), stride=1, pad=(3, 0), initialW=I.Normal(0.025))
            self.conv2 = L.Convolution2D(
                386, 386, ksize=(7, 1), stride=1, pad=(3, 0), initialW=I.Normal(0.025))
            self.conv3 = L.Convolution2D(
                386, 386, ksize=(3, 1), stride=1, pad=(1, 0), initialW=I.Normal(0.025))
            self.conv4 = L.Convolution2D(
                386, 386, ksize=(3, 1), stride=1, pad=(1, 0), initialW=I.Normal(0.025))
            self.conv5 = L.Convolution2D(
                386, 386, ksize=(3, 1), stride=1, pad=(1, 0), initialW=I.Normal(0.025))
            self.conv6 = L.Convolution2D(
                386, 386, ksize=(3, 1), stride=1, pad=(1, 0), initialW=I.Normal(0.025))
            # fully connected classifier head
            self.fc1 = L.Linear(None, 386)
            self.fc2 = L.Linear(386, 386)
            self.fc3 = L.Linear(386, out_size)
        self.usegpu = usegpu
        self.dropout = dropout

    def __call__(self, x):
        h0 = self.encoder(x)
        h1 = F.relu(self.conv1(h0))
        h2 = F.max_pooling_2d(h1, (3, 1), 1, (1, 0))
        h3 = F.relu(self.conv2(h2))
        h4 = F.max_pooling_2d(h3, (3, 1), 1, (1, 0))
        h5 = F.relu(self.conv3(h4))
        h6 = F.relu(self.conv4(h5))
        h7 = F.relu(self.conv5(h6))
        h8 = F.relu(self.conv6(h7))
        h9 = F.max_pooling_2d(h8, (3, 1), 1, (1, 0))
        h10 = F.relu(self.fc1(h9))
        h11 = F.relu(self.fc2(F.dropout(h10, ratio=self.dropout)))
        h12 = self.fc3(F.dropout(h11, ratio=self.dropout))
        if chainer.config.train:
            return h12
        return F.softmax(h12)
[/python]
Network 2 is lighter in terms of computation but shares the same performance. I shall try an RNN next.
[python]
# Assumed imports, as in the first network; Encoder is the same custom
# character-embedding module defined elsewhere in the project.
import chainer
import chainer.functions as F
import chainer.links as L
import chainer.initializers as I


class CharCNN(chainer.Chain):
    def __init__(self, seq_length, out_size, dropout=0.2, usegpu=True):
        super(CharCNN, self).__init__()
        with self.init_scope():
            self.encoder = Encoder(27, 54, dropout)
            self.bn0 = L.BatchNormalization((54, 452))
            self.conv1 = L.Convolution2D(
                54, 108, ksize=(7, 1), stride=2, pad=(0, 0), initialW=I.Normal(0.025))
            # ceil((452 - 7 + 1) / 2) = 223
            self.bn1 = L.BatchNormalization((108, 221))
            self.conv2 = L.Convolution2D(
                108, 216, ksize=(7, 1), stride=2, pad=(0, 0), initialW=I.Normal(0.025))
            self.bn2 = L.BatchNormalization((216, 106))
            self.conv3 = L.Convolution2D(
                216, 512, ksize=(3, 1), stride=2, pad=(0, 0), initialW=I.Normal(0.025))
            self.bn3 = L.BatchNormalization((512, 50))
            self.conv4 = L.Convolution2D(
                512, 1024, ksize=(3, 1), stride=2, pad=(0, 0), initialW=I.Normal(0.025))
            self.bn4 = L.BatchNormalization((1024, 22))
            self.conv5 = L.Convolution2D(
                1024, 2048, ksize=(3, 1), stride=1, pad=(0, 0), initialW=I.Normal(0.025))
            self.bn5 = L.BatchNormalization((2048, 18))
            self.conv6 = L.Convolution2D(
                2048, 4096, ksize=(3, 1), stride=1, pad=(0, 0), initialW=I.Normal(0.025))
            self.bn6 = L.BatchNormalization((4096, 14))
            self.fc1 = L.Linear(None, out_size)
        self.usegpu = usegpu
        self.dropout = dropout

    def __call__(self, x):
        h_0_1 = self.encoder(x)
        h_0_2 = self.bn0(h_0_1)
        h_1_1 = F.leaky_relu(self.conv1(h_0_2))  # 223
        h_1_2 = F.max_pooling_2d(h_1_1, ksize=(3, 1), stride=1, pad=(0, 0))  # 221
        h_1_3 = self.bn1(h_1_2)
        h_2_1 = F.leaky_relu(self.conv2(h_1_3))  # 108
        h_2_2 = F.max_pooling_2d(h_2_1, ksize=(3, 1), stride=1, pad=(0, 0))  # 106
        h_2_3 = self.bn2(h_2_2)
        h_3_1 = F.leaky_relu(self.conv3(h_2_3))  # 52
        h_3_2 = F.max_pooling_2d(h_3_1, ksize=(3, 1), stride=1, pad=(0, 0))  # 50
        h_3_3 = self.bn3(h_3_2)
        h_4_1 = F.leaky_relu(self.conv4(h_3_3))  # 24
        h_4_2 = F.max_pooling_2d(h_4_1, ksize=(3, 1), stride=1, pad=(0, 0))  # 22
        h_4_3 = self.bn4(h_4_2)
        h_5_1 = F.leaky_relu(self.conv5(h_4_3))  # 20
        h_5_2 = F.max_pooling_2d(h_5_1, ksize=(3, 1), stride=1, pad=(0, 0))  # 18
        h_5_3 = self.bn5(h_5_2)
        h_6_1 = F.leaky_relu(self.conv6(h_5_3))  # 16
        h_6_2 = F.max_pooling_2d(h_6_1, ksize=(3, 1), stride=1, pad=(0, 0))  # 14
        h_6_3 = self.bn6(h_6_2)
        # global average pooling over the remaining temporal dimension
        h7 = F.average_pooling_2d(h_6_3, ksize=(14, 1), stride=1, pad=(0, 0))
        h8 = self.fc1(h7)
        if chainer.config.train:
            return h8
        return F.softmax(h8)
[/python]
After developing a Deep Neural Network model, I decided to kick it up a notch by trying an RNN. The RNN model is a regression model that predicts a user's next best items, either for browsing or purchasing. Since it is a regressor, there is no single hard-coded item to recommend; instead, it predicts the feature values of the next best items, which can then be compared against our inventory of items. This way, similarity (1 - distance) can be computed and items can be ranked accordingly. The RNN model I designed is as follows, coded in Chainer.
[python]
# Assumed imports for this snippet; itemEncoder and itemDecoder are custom
# chains (item-feature encoder/decoder) defined elsewhere in the project.
import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L


class RNN(chainer.Chain):
    def __init__(self, itemEncoder, itemDecoder, n_feas, n_units, dropout_rate, use_gpu=False):
        super(RNN, self).__init__()
        with self.init_scope():
            self.itemEncoder = itemEncoder
            self.l1 = L.LSTM(n_units, n_units)
            self.l2 = L.LSTM(n_units, n_units)
            # self.l3 = L.GRU(n_units, n_units)
            self.itemDecoder = itemDecoder
        self.dropout_rate = dropout_rate
        self.use_gpu = use_gpu
        # small uniform initialization for all parameters
        for param in self.params():
            param.data[...] = np.random.uniform(-0.1, 0.1, param.data.shape)

    def reset_state(self):
        self.l1.reset_state()
        self.l2.reset_state()
        # self.l3.reset_state()

    def __call__(self, xs):
        # xs: (batch, sequence, features) -> iterate over time steps
        if self.use_gpu:
            xs = F.transpose(xs, (1, 0, 2))
        else:
            xs = np.transpose(xs, (1, 0, 2))
        for x in xs:
            h0 = self.itemEncoder(x)
            h1 = self.l1(F.dropout(h0, self.dropout_rate))
            h2 = self.l2(F.dropout(h1, self.dropout_rate))
            y = self.itemDecoder(F.dropout(h2, self.dropout_rate))
        return y
[/python]
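As described above, the regressor's output is compared against the item inventory and items are ranked by similarity. A minimal version of that ranking step might look like the sketch below; the function and variable names are made up, and similarity is taken here as 1 minus a normalized distance.
[python]
# Hedged sketch of the ranking step described above: compare the predicted
# feature vector against an inventory of item feature vectors and rank by
# similarity (1 - normalized distance). Names are illustrative only.
import numpy as np

def rank_items(predicted_features, inventory_features, top_k=10):
    # inventory_features: (n_items, n_feas); predicted_features: (n_feas,)
    diffs = inventory_features - predicted_features
    distances = np.linalg.norm(diffs, axis=1)
    similarities = 1.0 - distances / (distances.max() + 1e-8)
    best = np.argsort(-similarities)[:top_k]
    return best, similarities[best]
[/python]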
Will continue on production infra later…
We already know what a Deep Neural Network is, but what about its breadth? Broad deep learning means using features from different domains and combining them through an embedding/mixing layer. In the case of user-item interaction, it would be a Neural Network trained on user data combined with another Neural Network trained on item data through an embedding layer, which serves as the first layer of the final Neural Network. A bare-bones sketch of the idea follows below.
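This is my own hedged Chainer sketch of the idea, not a reference implementation: a user tower and an item tower whose outputs are concatenated by a mixing layer and fed to a final network. Layer names and sizes are illustrative only.
[python]
# Hedged sketch of the "broad" idea above: two towers (user and item)
# whose outputs are concatenated and fed to a final network.
import chainer
import chainer.functions as F
import chainer.links as L

class BroadDeepNet(chainer.Chain):
    def __init__(self, n_user_feas, n_item_feas, n_hidden, out_size):
        super(BroadDeepNet, self).__init__()
        with self.init_scope():
            self.user_fc = L.Linear(n_user_feas, n_hidden)  # user-domain tower
            self.item_fc = L.Linear(n_item_feas, n_hidden)  # item-domain tower
            self.mix = L.Linear(2 * n_hidden, n_hidden)      # embedding/mixing layer
            self.out = L.Linear(n_hidden, out_size)

    def __call__(self, user_x, item_x):
        hu = F.relu(self.user_fc(user_x))
        hi = F.relu(self.item_fc(item_x))
        h = F.relu(self.mix(F.concat((hu, hi), axis=1)))
        return self.out(h)
[/python]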
Transfer Learning bears a similar idea to the Deep Neural Network (DNN). Using a more obvious example, image processing: what a DNN (specifically a Convolutional Neural Network, CNN) does in its earlier layers (#hidden layers - 1) is extract features such as edges, colours, combinations of colours, combinations of edges, areas of focus and so on. This process can be thought of as preprocessing, and the resulting "features" are usually very hard to comprehend. With the last layer being the output layer, the second-to-last layer works like domain mapping, whereas the output layer serves as the discriminating layer.
What is domain mapping exactly? For example, a single person could have features such as age, interests, favorite movies, genres of songs, etc. This information could be used to predict a state of emotion; it could also be used to estimate one's income group. Not all layers are equally transferable, but in the case of images there is a fairly narrow range of exploitable low-level features, as mentioned above (edges, colours, shapes), and these features are highly reusable across different problem domains: for instance, predicting types of attire like dresses or shorts, predicting female or male attire, fashionability and so on. The second-to-last layer learns the relationship of these colours, shapes and edges to the problem domain, whereas the output layer learns the decision boundary of these data points in the domain space. The benefit of transfer learning is then the reusability of the hidden layers, which would be very expensive to retrain. One just needs to swap out the last 2 layers when applying the model to a different problem domain. Depending on the fitness of the model, one could actually vary the number of layers to swap out; the last 2 layers is just the textbook example. A rough sketch of this recipe follows below.
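A rough Chainer-flavoured sketch of what "swapping out the last two layers" could look like: freeze a pretrained feature extractor and train only a new mapping layer plus a new output layer. The pretrained extractor here is hypothetical; this is an illustration of the recipe, not code from an actual project.
[python]
# Hedged sketch of the transfer-learning recipe above: keep a pretrained
# feature extractor frozen and learn only a new "domain mapping" layer plus
# a new output layer. `pretrained_extractor` is a hypothetical Chainer chain.
import chainer
import chainer.functions as F
import chainer.links as L

class TransferredNet(chainer.Chain):
    def __init__(self, pretrained_extractor, n_hidden, out_size):
        super(TransferredNet, self).__init__()
        with self.init_scope():
            self.extractor = pretrained_extractor
            self.domain_fc = L.Linear(None, n_hidden)  # new second-to-last layer
            self.out = L.Linear(n_hidden, out_size)    # new discriminating layer
        self.extractor.disable_update()  # freeze the expensive, reusable layers

    def __call__(self, x):
        h = self.extractor(x)
        h = F.relu(self.domain_fc(h))
        return self.out(h)
[/python]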
Recently I have grown an interest in the betting industry and decided to crawl NBA data, so I have come up with a web crawler. One can use it as a reference, but be warned that the code is still quite dirty, i.e. there is a lot of repetition here and there, as this was my first time writing a crawler and I was short on time.
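The actual crawler lives in the repo; the core loop is roughly of the shape sketched below (requests + BeautifulSoup). The URL and table id are placeholders for illustration, not the real endpoints used.
[python]
# Hedged sketch of a simple crawler loop, not the actual crawler code:
# fetch a stats page, parse the HTML table rows, and collect them.
# The URL and table id below are placeholders.
import requests
from bs4 import BeautifulSoup

def crawl_table(url, table_id):
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    table = soup.find("table", {"id": table_id})
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows

if __name__ == "__main__":
    # placeholder arguments for illustration only
    for row in crawl_table("https://example.com/nba/stats", "per_game_stats"):
        print(row)
[/python]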