How Pinterest Leverages Realtime User Actions in Recommendation to Boost Homefeed Engagement Volume | by Pinterest Engineering | Pinterest Engineering Blog

Xue Xia, Software Engineer, Homefeed Ranking; Neng Gu, Software Engineer, Content & User Understanding; Dhruvil Deven Badani, Engineering Manager, Homefeed Ranking; Andrew Zhai, Software Engineer, Advanced Technologies Group

In this blog post, we'll demonstrate how we improved Pinterest Homefeed engagement volume from a machine learning model design perspective, by leveraging realtime user action features in the Homefeed recommender system.

The Homepage of Pinterest is one of the most important surfaces for pinners to discover inspirational ideas and contributes to a large fraction of overall user engagement. The pins shown in the top positions on the Homefeed need to be personalized to create an engaging pinner experience. We retrieve a small fraction of the large volume of pins created on Pinterest as Homefeed candidate pins, according to user interest, followed boards, etc. To present the most relevant content to pinners, we then use a Homefeed ranking model (aka Pinnability model) to rank the retrieved candidates by accurately predicting their personalized relevance to given users. Therefore, the Homefeed ranking model plays an important role in improving pinner experience. Pinnability is a state-of-the-art neural network model that consumes pin signals, user signals, context signals, etc. and predicts user action given a pin. The high level architecture is shown in Figure 3.

Flow map of candidate pins going through pinnability models, becoming relevance ordered, then to Homefeed

The Pinnability model has been using some pretrained user embeddings to model a user's interest and preference. For example, we use PinnerFormer (PinnerSAGE V3), a static, offline-learned user representation that captures a user's long term interest by leveraging their past interaction history on Pinterest.

However, there are still some aspects that pretrained embeddings like PinnerSAGE do not cover, and we can fill in the gap by using a realtime user action sequence feature:

  • Model pinners' short-term interest: PinnerSAGE is trained using thousands of user actions over a long term, so it mostly captures long-term interest. On the other hand, the realtime user action sequence models short-term user interest and is complementary to the PinnerSAGE embedding.
  • More responsive: Unlike other static features, realtime signals are able to respond faster. This is helpful, especially for new, casual, and resurrected users who do not have much past engagement.
  • End-to-end optimization for recommendation model objective: We use a user action sequence feature as a direct input feature to the recommendation model and optimize directly for model objectives. Unlike PinnerSAGE, we can attend the pin candidate features with each individual sequence action for more flexibility.

In order to give pinners real-time feedback to their recent actions and improve the user experience on Homefeed, we propose to incorporate the realtime user action sequence signal into the recommendation model.

A stable, low latency, realtime feature pipeline supports a robust online recommendation system. We serve the latest 100 user actions as a sequence, populated with pin embeddings and other metadata. The overall architecture can be segmented into event time and request time, as shown in Figure 2.

At event time, rockstore stores information from the Kafka log via the NRT/Flink App Materializer. At request time, HF logging/serving requests go through Unity HF, USSv2 Aggregator, USSv2 view, then stored in rockstore and transformed into merged UFr

To minimize application downtime and signal failure, efforts are made in:

ML side

  • Features/schema force validation
  • Delayed source event handling to prevent data leakage
  • Itemized actions tracking over time data shifting
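
One way to read the data leakage point above: an event that is delivered late must still be placed by the time the user acted, and anything that happened at or after the request must stay out of that request's features. The sketch below is a minimal illustration under that assumption. The `ActionEvent` record, its field names, and `leak_free_sequence` are hypothetical and not taken from Pinterest's pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ActionEvent:
    # Hypothetical event record; the field names are illustrative only.
    pin_id: str
    action_type: str
    event_ts: float  # when the user acted, in seconds since epoch


def leak_free_sequence(events: List[ActionEvent], request_ts: float,
                       max_len: int = 100) -> List[ActionEvent]:
    """Return the most recent actions that happened strictly before the request."""
    # Late-delivered events may be stored out of order, so we filter and sort
    # on event time rather than relying on arrival order.
    eligible = [e for e in events if e.event_ts < request_ts]
    eligible.sort(key=lambda e: e.event_ts, reverse=True)  # newest first
    return eligible[:max_len]


events = [
    ActionEvent("pin_a", "repin", 100.0),
    ActionEvent("pin_c", "click", 250.0),  # happened after the request below
    ActionEvent("pin_b", "hide", 180.0),   # arrived late, but happened before it
]
sequence = leak_free_sequence(events, request_ts=200.0)
```

Here `sequence` keeps `pin_b` and `pin_a` (newest first) and drops `pin_c`, whose action time is later than the request.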

Ops side

  • Stats monitoring on core job health, latency/throughput etc.
  • Comprehensive on-calls for minimal application downtime
  • Event recovery strategy

We generated the following features for the Homefeed recommender model:

| Feature Name | Description |
| --- | --- |
| pinEngagementActionTypeSequence | Users' past 100 engagement actions (e.g. repin, click, hide, etc.) |
| pinEngagementEmbeddingSequence | The pinSAGE embeddings of users' past 100 engaged pins |
| pinEngagementTimestampSequence | The timestamps of users' past 100 engagements |
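
As a rough illustration of how these three features line up, the sketch below packs a user's recent actions into three parallel, fixed-length arrays. The embedding dimension, the action-type vocabulary, the zero padding, and the newest-first ordering are all assumptions made for the example. Only the feature names and the length of 100 come from the post.

```python
import numpy as np

SEQ_LEN = 100  # from the post: the latest 100 actions are served
EMB_DIM = 8    # assumed for illustration; the real embedding size is not stated
ACTION_IDS = {"pad": 0, "repin": 1, "click": 2, "hide": 3}  # hypothetical vocabulary


def build_sequence_features(actions):
    """Pack (action_type, embedding, timestamp) tuples, newest first, into arrays."""
    action_types = np.zeros(SEQ_LEN, dtype=np.int64)
    embeddings = np.zeros((SEQ_LEN, EMB_DIM), dtype=np.float32)
    timestamps = np.zeros(SEQ_LEN, dtype=np.float64)
    # Positions beyond the user's history keep the padding value 0.
    for i, (action, emb, ts) in enumerate(actions[:SEQ_LEN]):
        action_types[i] = ACTION_IDS[action]
        embeddings[i] = emb
        timestamps[i] = ts
    return {
        "pinEngagementActionTypeSequence": action_types,
        "pinEngagementEmbeddingSequence": embeddings,
        "pinEngagementTimestampSequence": timestamps,
    }


rng = np.random.default_rng(0)
recent = [
    ("repin", rng.normal(size=EMB_DIM), 1_700_000_300.0),
    ("hide", rng.normal(size=EMB_DIM), 1_700_000_100.0),
]
features = build_sequence_features(recent)
```

Keeping the three arrays index-aligned means position `i` always describes the same action across type, embedding, and timestamp.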

Figure 3 is an overview of our Homefeed ranking model. The model consumes a <user, pin> pair and predicts the action that the user takes on the candidate pin. Our input to the Pinnability model includes signals of various types, including pinner signals, user signals, pin signals, and context signals. We now add a novel, realtime user sequence signals input and use a sequence processing module to process the sequence features. With all the features transformed, we feed them to an MLP layer with multiple action heads to predict the user action on the candidate pin.
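
To make the "MLP with multiple action heads" idea concrete, here is a minimal NumPy sketch: one shared hidden layer followed by one sigmoid head per action type. The head names, layer sizes, and random weights are assumptions for illustration. The actual Pinnability layers are not described in this post.

```python
import numpy as np

rng = np.random.default_rng(0)
ACTION_HEADS = ["repin", "click", "hide"]  # assumed head names


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(input_dim, hidden_dim):
    """Random weights for one shared layer plus one linear head per action."""
    return {
        "w_shared": rng.normal(scale=0.1, size=(input_dim, hidden_dim)),
        "b_shared": np.zeros(hidden_dim),
        "heads": {
            name: (rng.normal(scale=0.1, size=hidden_dim), 0.0)
            for name in ACTION_HEADS
        },
    }


def predict_actions(features, params):
    """Shared representation first, then an independent probability per action."""
    hidden = relu(features @ params["w_shared"] + params["b_shared"])
    return {
        name: float(sigmoid(hidden @ w + b))
        for name, (w, b) in params["heads"].items()
    }


features = rng.normal(size=32)  # stand-in for the concatenated, transformed features
params = init_params(input_dim=32, hidden_dim=16)
probs = predict_actions(features, params)
```

Each head shares the same hidden representation, so a signal learned for one action (for example repin) can also inform the others (for example hide).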

Diagram of Pinterest Homefeed Ranking (Pinnability) Model Architecture

Recent literature has been using transformers for recommendation tasks. Some works model the recommendation problem as a sequence prediction task, where the model's input is (S1, S2, …, SL-1) and its expected output is a "shifted" version of the same sequence: (S2, S3, …, SL). To keep the current Pinnability architecture, we only adopt the encoder part of these models.
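
The shifted-sequence setup used in that literature can be shown in a few lines. The helper name `shifted_pairs` is made up for this example.

```python
def shifted_pairs(sequence):
    """Split (S1..SL) into the model input (S1..SL-1) and the target (S2..SL)."""
    if len(sequence) < 2:
        raise ValueError("need at least two items to form an input/target pair")
    return sequence[:-1], sequence[1:]


inputs, targets = shifted_pairs(["S1", "S2", "S3", "S4"])
```

At every position the target is simply the next item of the input, which is the objective the Homefeed model does not adopt, since it keeps only the encoder.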

To construct the transformer input, we utilized three important realtime user sequence features:

  1. Engaged pin embedding: pin embeddings (learned GraphSage embedding) for the past 100 engaged pins in user history
  2. Action type: type of engagement in user action sequence (e.g., repin, click, hide)
  3. Timestamp: timestamp of a user's engagement in user history

We also use the candidate pin embedding to perform early fusion with the above realtime user sequence features.

initial architecture of user sequence transformer module

As illustrated in Figure 3, to construct the input of the sequence transformer module, we stack the [candidate_pin_emb, action_emb, engaged_pin_emb] to a matrix. The early fusion of candidate pin and user sequence proved to be very important according to online and offline experiments. We also apply a random time window mask on entries in the sequence where the actions were taken within one day of request time. The random time window mask is used to make the model less responsive and to avoid a diversity drop. Then we feed it into a transformer encoder. For the initial experiment, we only use one transformer encoder layer. The output of the transformer encoder is a matrix of shape [seq_len, hidden_dim]. We then flatten the output to a vector and feed it along with all other features to MLP layers to predict multi-head user actions.
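
The input construction can be sketched as follows. This is one plausible reading, not the production code: the candidate embedding is repeated at every sequence position and concatenated with that position's action and engaged-pin embeddings, and the mask hides actions younger than a window drawn uniformly between zero and one day. The uniform sampling, the dimensions, and the function names are assumptions.

```python
import numpy as np

ONE_DAY = 24 * 60 * 60  # seconds


def build_transformer_input(candidate_emb, action_emb, engaged_emb):
    """Early fusion: attach the candidate embedding to every sequence position."""
    seq_len = engaged_emb.shape[0]
    tiled_candidate = np.tile(candidate_emb, (seq_len, 1))  # [seq_len, cand_dim]
    return np.concatenate([tiled_candidate, action_emb, engaged_emb], axis=1)


def random_time_window_mask(timestamps, request_ts, rng):
    """Return a boolean array where True marks positions to hide from attention."""
    window = rng.uniform(0.0, ONE_DAY)  # assumed: window length sampled per request
    age = request_ts - timestamps
    return age < window


rng = np.random.default_rng(0)
seq_len, pin_dim, action_dim = 100, 8, 4
candidate_emb = rng.normal(size=pin_dim)
engaged_emb = rng.normal(size=(seq_len, pin_dim))
action_emb = rng.normal(size=(seq_len, action_dim))
request_ts = 1_700_000_000.0
# Synthetic action times, newest first, spread over the last seven days.
timestamps = request_ts - np.linspace(60.0, 7 * ONE_DAY, seq_len)

x = build_transformer_input(candidate_emb, action_emb, engaged_emb)
mask = random_time_window_mask(timestamps, request_ts, rng)
```

Because the window never exceeds one day, only very recent actions can be masked, which matches the stated goal of damping over-reaction to the latest activity.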

In our second iteration of the user sequence module (v1.1), we made some tuning on top of the v1.0 architecture. We increased the number of transformer encoder layers and compressed the transformer output. Instead of flattening the full output matrix, we only took the first 10 output tokens, concatenated them with the max pooling token, and flattened it to a vector of length (10 + 1) * hidden_dim. The first 10 output tokens capture the user's most recent interests and the max pooling token can represent the user's long term preference. Because the output size became much smaller, it's affordable to apply an explicit feature crossing layer with DCN v2 architecture on the full feature set as previously illustrated in Fig. 2.
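
A small NumPy sketch of the v1.1 output compression, followed by a single cross layer in the DCN v2 form x_{l+1} = x_0 * (W x_l + b) + x_l (elementwise product). The hidden size, the stand-in "other features", and the random weights are assumptions. The sequence is assumed to be ordered newest first, so that the first 10 tokens are the most recent ones.

```python
import numpy as np


def compress_encoder_output(encoder_out, num_recent=10):
    """[seq_len, hidden_dim] -> vector of length (num_recent + 1) * hidden_dim."""
    recent_tokens = encoder_out[:num_recent]                 # first 10 output tokens
    max_pool_token = encoder_out.max(axis=0, keepdims=True)  # [1, hidden_dim]
    return np.concatenate([recent_tokens, max_pool_token], axis=0).reshape(-1)


def dcn_v2_cross_layer(x0, xl, weight, bias):
    """One cross layer: elementwise product of x0 with a linear map of xl, plus xl."""
    return x0 * (weight @ xl + bias) + xl


rng = np.random.default_rng(0)
seq_len, hidden_dim = 100, 16
encoder_out = rng.normal(size=(seq_len, hidden_dim))
seq_vector = compress_encoder_output(encoder_out)

other_features = rng.normal(size=24)  # stand-in for the remaining model features
x0 = np.concatenate([seq_vector, other_features])
weight = rng.normal(scale=0.01, size=(x0.size, x0.size))
bias = np.zeros(x0.size)
crossed = dcn_v2_cross_layer(x0, x0, weight, bias)
```

With these sizes the sequence vector has 176 entries instead of the 1,600 a full flatten would produce. The cross layer's weight matrix is square in the input size, which is why the smaller output makes explicit crossing affordable.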

Improved architecture of user sequence transformer module (v1.1)

Challenge 1: Engagement Rate Decay

Through online experiments, we saw the user engagement metrics gradually decay in the group with the realtime action sequence treatment. Figure 6 demonstrates that for the same model architecture, if we don't retrain it, the engagement gain is much smaller than if we retrain the model on fresh data.

Chart of Homefeed Repin Volume Increase change by time. Blue line represents retrained model. Red line represents fixed model.

Our hypothesis is that our model with realtime features is quite time sensitive and requires frequent retraining. To verify this hypothesis, we retrain both the control group (without realtime user action feature) and the treatment group (with realtime user action feature) at the same time, and we compare the effect of retraining for both models. As shown in Figure 6, we found that retraining benefits the treatment model much more than the control model.

Chart of Overall repin gain of sequence model retrain and control model retrain across day 0 to day 11

Therefore, to tackle the engagement decay challenge, we retrain the realtime sequence model twice per week. In doing this, the engagement rate has become much more stable.

Challenge 2: Serving Large Model at Organic Scale

With the transformer module introduced to the recommender model, the complexity has increased significantly. Before this work, Pinterest had been serving the Homefeed ranking model on CPU clusters. Our model increases CPU latency by more than 20x. We then migrated to GPU serving for the ranking model and are able to keep neutral latency at the same cost.

On Pinterest, one of the most important user actions is repin, or save. Repin is one of the key indicators of user engagement on the platform. Therefore, we approximate the user engagement level with repin volume and use repin volume to evaluate model performance.

Offline Evaluation

We perform offline evaluation on different models that process realtime user sequence features. Specifically, we tried the following architectures:

  • Average Pooling: the simplest architecture, where we use the average of the pin embeddings in the user sequence to represent the user's short term interest
  • Convolutional Neural Network (CNN): uses a CNN to encode a sequence of pin embeddings. CNN is suitable for capturing the dependent relationship across local information
  • Recurrent Neural Network (RNN): uses an RNN to encode a sequence of pin embeddings. Compared to CNN, RNN better captures long term dependencies.
  • Long Short-Term Memory (LSTM): uses LSTM, a more sophisticated version of RNN that captures longer-term dependencies even better than RNN by using memory cells and gating.
  • Vanilla Transformer: encodes only the pin embedding sequence directly using the Transformer module.
  • Improved Transformer v1.0: Improved transformer architecture as illustrated in Figure 4.

For the Homefeed surface specifically, two of the most important metrics are HIT@3 for repin and hide prediction. For repin, we try to improve the HIT@3. For hide, the goal is to decrease HIT@3.
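
The post does not spell out how HIT@3 is computed. The sketch below uses one common definition as an assumption: for each request, rank the candidates by predicted score and count a hit if any of the top 3 carries the action label. The toy scores and labels are invented for the example.

```python
import numpy as np


def hit_at_k(scores_per_request, labels_per_request, k=3):
    """Fraction of requests whose top-k scored candidates include a labeled action."""
    hits = []
    for scores, labels in zip(scores_per_request, labels_per_request):
        top_k = np.argsort(-np.asarray(scores))[:k]  # indices of the k highest scores
        hits.append(bool(np.asarray(labels)[top_k].any()))
    return float(np.mean(hits))


scores = [[0.9, 0.1, 0.8, 0.3, 0.7], [0.2, 0.6, 0.5, 0.4, 0.1]]
labels = [[0, 1, 0, 0, 1], [1, 0, 0, 0, 0]]
hit3 = hit_at_k(scores, labels)
```

In this toy data the first request has a labeled pin in its top 3 and the second does not, so `hit3` is 0.5. Under this definition a good repin head pushes the value up, while a good hide head pushes hidden pins out of the top 3 and the value down.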

| Model | hide | repin |
| --- | --- | --- |
| Average Pooling | -1.61% | 0.21% |
| CNN | -1.29% | 0.08% |
| RNN | -2.46% | -1.05% |
| LSTM | -2.98% | -0.75% |
| Vanilla Transformer | -8.45% | 1.56% |
| Improved Transformer v1.0 | -13.49% | 8.87% |

The offline results show that even with the vanilla transformer and only pin embeddings, the performance is already better than other architectures. The improved transformer architecture showed very strong offline results: +8.87% offline repin and a -13.49% hide drop. The gain of the improved transformer v1.0 over the vanilla transformer came from several aspects:

  1. Using action embedding: this helps the model distinguish positive and negative engagement
  2. Early fusion of candidate pin and user sequence: this contributes to the majority of the engagement gain, according to online and offline experiments
  3. Random time window mask: helps with diversity

Online Evaluation

Then we conducted an online A/B experiment on 1.5% of the total traffic with the improved transformer model v1.0. During the online experiment, we observed that the repin volume for overall users increased by 6%. We define the set of new, casual, and resurrected users as non-core users, and we observed that the repin volume gain on non-core users can reach 11%. Aligning with the offline evaluation, the hide volume decreased by 10%.

Recently, we tried transformer model v1.1 as illustrated in Figure 4, and we achieved an additional 5% repin gain on top of the v1.0 model. Hide volume remains neutral relative to v1.0.

| Model Variation | Cumulative Homefeed Repin Volume (all users) | Cumulative Homefeed Repin Volume (non-core users) | Cumulative Homefeed Hide Volume (all users) |
| --- | --- | --- | --- |
| Sequence Model V1.0 | 6% | 10% | -10% |
| Sequence Model V1.1 + Feature Crossing | 11% | 17% | -10% |

Production Metrics (Full Traffic)

We want to call out an interesting observation: the online experiment underestimates the power of the realtime user action sequence. We observed a higher gain when we rolled out the model as the production Homefeed ranking model to full traffic. This is because of the learning effect of a positive feedback loop:

  1. As users see a more responsive Homefeed, they tend to engage with more relevant content, and their behavior changes (for example, more clicks or repins)
  2. With this behavior change, the realtime user sequence that logs their behavior in realtime also shifts. For example, there are more repin actions in the sequence. Then we generate the training data with this shifted user sequence feature.
  3. As we retrain the Homefeed ranking model with this shifted dataset, there's a positive compounding effect that makes the retrained model more powerful, and thus yields a higher engagement rate. This then loops us back to 1.
Diagram of feedback loop of Realtime Sequence Model: 1. User behavior change: User’s behavior changed as they see more responsive recommendations Leads to 2. Training data change -User action sequence feature itself changed — More repin actions in training data then leads to 3. Ranking model improved — model is retrained on latest dataset — predicts user action more accurately — higher engagement Then loop back to 1

The actual Homefeed repin volume increase that we observed after shipping this model to production is higher than the online experiment results. However, we will not disclose the exact number in this blog.

Our work to use realtime user action signals in Pinterest's Homefeed recommender system has greatly improved Homefeed relevance. The transformer architecture turns out to work best among the traditional sequence modeling approaches we tried. There were various challenges along the way that were non-trivial to tackle. We discovered that retraining the model with the realtime sequence is important to keep up user engagement, and that GPU serving is indispensable for large scale, complex models.

It's exciting to see the big gain from this work, but what's more exciting is that we know there's still much more room to improve. To continue improving the Pinner experience, we'll work on the following aspects:

  1. Feature Improvement: We plan to develop a more fine-grained realtime sequence signal that includes more action types and action metadata.
  2. GPU Serving Optimization: This is the first use case to use GPU clusters to serve large models at organic scale. We plan to improve GPU serving usability and performance.
  3. Model Iteration: We will continue working on the model iteration so that we fully utilize the realtime signal.
  4. Adoption on Other Surfaces: We'll try similar ideas in other surfaces: related pins, notifications, search, etc.

This work is a result of collaboration across multiple teams at Pinterest. Many thanks to the following people who contributed to this project:

  • GPU serving optimization: Po-Wei Wang, Pong Eksombatchai, Nazanin Farahpour, Zhiyuan Zhang, Saurabh Joshi, Li Tang
  • Technical support on ML: Nikil Pancha
  • Signal generation and serving: Yitong Zhou
  • Fast controllability distribution convergence: Ludek Cigler

To learn more about engineering at Pinterest, check out the rest of our Engineering Blog and visit our Pinterest Labs site. To explore life at Pinterest, visit our Careers page.