Vertical collaborative learning, also known as vertical federated learning
(VFL), has recently become prominent as a way to process data distributed
across many individual sources without the need to centralize it. Multiple
participants collaboratively train models on their local data in a
privacy-preserving manner. To date, VFL has become a de facto solution for
securely learning a model among organizations, allowing knowledge to be
shared without compromising the privacy of any individual organization.
Despite the rapid development of VFL systems, we find that certain inputs of
a participant, named adversarial dominating inputs (ADIs), can steer the
joint inference toward the adversary’s desired outcome and force the other
(victim) participants to make negligible contributions. The victims thereby
lose the rewards that are usually allocated according to the importance of
their contributions in collaborative learning scenarios.
We conduct a systematic study on ADIs by first proving their existence in
typical VFL systems. We then propose gradient-based methods to synthesize ADIs
of various formats and exploit common VFL systems. We further launch greybox
fuzz testing, guided by the resiliency score of “victim” participants, to
perturb adversary-controlled inputs and systematically explore the VFL attack
surface in a privacy-preserving manner. We also study in depth the influence
of critical parameters and settings on synthesizing ADIs. Our study reveals
new VFL attack opportunities, helping to identify unknown threats before
breaches occur and to build more secure VFL systems.
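
To make the notion of an ADI concrete, the following is a minimal toy sketch and not the synthesis procedure of this work. It assumes a two-participant VFL model whose joint logits are the sum of two linear maps, and, purely for illustration, it assumes white-box access to both weight matrices and to sampled victim inputs, which a real adversary would generally not have. All names (W_a, W_b, synthesize_adi, domination_rate) are hypothetical. Gradient descent on the adversary's input minimizes the cross-entropy toward a chosen target class, averaged over the sampled victim inputs.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with max subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def joint_logits(W_a, W_b, x_a, X_b):
    # Toy joint inference: one fixed adversary input x_a (d_a,) is paired
    # with every victim input in X_b (n, d_b); logits are simply summed.
    return x_a @ W_a.T + X_b @ W_b.T

def synthesize_adi(W_a, W_b, X_b, target, steps=2000, lr=0.1):
    # Gradient descent on x_a to minimize the mean cross-entropy toward
    # `target` over the sampled victim inputs (white-box toy assumption).
    x_a = np.zeros(W_a.shape[1])
    onehot = np.eye(W_a.shape[0])[target]
    for _ in range(steps):
        p = softmax(joint_logits(W_a, W_b, x_a, X_b))
        # d(loss)/d(logits) = p - onehot; chain rule through W_a.
        grad = (p - onehot).mean(axis=0) @ W_a
        x_a -= lr * grad
    return x_a

def domination_rate(W_a, W_b, x_a, X_b, target):
    # Fraction of victim inputs for which the joint prediction is `target`.
    preds = joint_logits(W_a, W_b, x_a, X_b).argmax(axis=1)
    return float((preds == target).mean())

rng = np.random.default_rng(0)
n_classes, d_a, d_b = 3, 8, 4                       # hypothetical sizes
W_a = rng.normal(size=(n_classes, d_a))             # adversary-side weights
W_b = rng.normal(size=(n_classes, d_b))             # victim-side weights
X_b_train = rng.uniform(-1, 1, size=(256, d_b))     # sampled victim inputs
X_b_test = rng.uniform(-1, 1, size=(1000, d_b))     # held-out victim inputs
target = 2

x_adi = synthesize_adi(W_a, W_b, X_b_train, target)
rate = domination_rate(W_a, W_b, x_adi, X_b_test, target)
print(f"domination rate on held-out victim inputs: {rate:.3f}")
```

In this toy setting the adversary's input is unbounded, so it can always outweigh the bounded victim features. A realistic attack would additionally have to keep the input within a valid feature range and would not see the victim's weights or raw data.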