BEBA: BEhavioural BAsed forwarding
Grant Agreement: 644122

Deliverable Report
D2.4 Extended BEBA abstraction proof of concept prototype

Deliverable title: BEBA/WP2 – D2.4 Extended BEBA abstraction proof of concept prototype
Version: 1.0
Due date of deliverable (month): September 2016
Actual submission date of the deliverable (dd/mm/yyyy): 11/10/2016
Start date of project (dd/mm/yyyy): 01/01/2015
Duration of the project: 27 months
Work Package: WP2
Task: T2.3
Leader for this deliverable: CNIT
Other contributing partners: CESNET
Authors: Marco Bonola, Valerio Bruschi, Davide Sanvito, Salvatore Pontarelli, Giuseppe Bianchi, Carmelo Cascone, Antonio Capone (CNIT), Viktor Pus, Libor Polcak, Pavel Benacek (CESNET)
Deliverable reviewer(s): Marco Bonola
Project co-funded by the European Commission within the Horizon 2020
(H2020) Programme
DISSEMINATION LEVEL
PU: Public
PP: Restricted to other programme participants (including the Commission Services)
RE: Restricted to a group specified by the consortium (including the Commission Services)
CO: Confidential, only for members of the consortium (including the Commission Services)
X
REVISION HISTORY
Revision | Date | Author | Organisation | Description
0.1 | 26/08/2016 | Salvatore Pontarelli, Giuseppe Bianchi | CNIT | First draft of section 2.1, 2.2
0.2 | 29/08/2016 | Valerio Bruschi | CNIT | Section 2.3 added
0.3 | 04/09/2016 | Davide Sanvito, Carmelo Cascone | CNIT | Section 1 added
0.5 | 20/09/2016 | Viktor Pus, Libor Polcak, Pavel Benacek | 6WIND | Section 3 added
0.6 | 29/09/2016 | Salvatore Pontarelli, Marco Bonola | CNIT | Section 2.1 and 2.2 finalized
1.0 | 07/10/2016 | Marco Bonola | CNIT | Final review
PROPRIETARY RIGHTS STATEMENT
This document contains information, which is proprietary to the BEBA consortium. Neither this document nor the
information contained herein shall be used, duplicated or communicated by any means to any third party, in whole or
in parts, except with the prior written consent of the BEBA consortium. This restriction legend shall not be altered or
obliterated on or from this document.
STATEMENT OF ORIGINALITY
This deliverable contains original unpublished work except where clearly indicated otherwise. Acknowledgement of
previously published material and of the work of others has been made through appropriate citation, quotation or both.
TABLE OF CONTENTS
EXECUTIVE SUMMARY
1 EXTENDED BEBA SOFTWARE PROOF OF CONCEPT PROTOTYPE
  1.1 EXTENDED BEBA OFSOFTSWITCH13
    1.1.1 SW implementation details
    1.1.2 Packing/unpacking of messages from the controller
    1.1.3 Processing of messages from the controller
    1.1.4 Data packet processing
  1.2 EXTENDED BEBA BASIC CONTROLLER PROTOTYPE
    1.2.1 Implementation of the BEBA extensions
  1.3 STATE SYNCHRONIZATION MECHANISM
2 FPGA-BASED HARDWARE PROOF OF CONCEPT PROTOTYPE
  2.1 OPP PROTOTYPE ARCHITECTURE DESIGN
    2.1.1 Packet reception and header field extraction
    2.1.2 Flow Context Table
    2.1.3 Condition Logic Block
    2.1.4 XFSM Table
    2.1.5 Update Logic Block
    2.1.6 Miscellaneous Blocks and Microcontroller
    2.1.7 Direct memory access via PCIe
  2.2 FPGA PROTOTYPE AND INITIAL EVALUATION OF ASIC IMPLEMENTATION
    2.2.1 FPGA prototype
    2.2.2 Performance achievable with an ASIC implementation
  2.3 APPLICATION DEMO
3 PRELIMINARY WORK ON HW-BASED FLEXIBLE PACKET PARSING
    3.1.1 Competitive solutions
    3.1.2 Compiling P4 to FPGA
    3.1.3 Transformation algorithm
    3.1.4 Reached results
Executive summary
This deliverable documents the project results related to activity T2.3 “Proof of concept
validation”. In particular, it describes the proof of concept prototype for the extended (XFSM)
BEBA abstraction. Section 1 presents the software proof of concept, Section 2 describes the
FPGA-based implementation, and Section 3 presents the flexible packet parser implemented in
FPGA.
1. Extended BEBA Software proof of concept prototype
To experimentally validate the extended BEBA abstraction, we developed a prototype software
implementation. This work extends the BEBA software proof of concept presented in
Deliverable D2.2 and is therefore based on the ofsoftswitch13 [1] software switch.
As for the BEBA controller, we started from our modified version of the OpenFlow controller
Ryu [2] and added support for the new messages defined in Section 3 of Deliverable D2.3.
1.1 Extended BEBA ofsoftswitch13
As already stated in Deliverable D2.2, support for basic BEBA has been realized in
ofsoftswitch13 as an OpenFlow experimenter extension. The OpenFlow specification defines the
common structure of experimenter match fields, messages and actions, and each vendor can then
customize the format of each structure. The experimenter_id that identifies all the protocol
additions for the extended BEBA is still 0xBEBABEBA.
The additional features of extended BEBA have been built upon the basic BEBA software proof of
concept, so the following sections focus on the new code portions. Furthermore, since the
definitions of the new experimenter match fields, messages and actions have been extensively
presented in Section 3 of D2.3, we omit the code sections that contain only data structures.
In particular, we present all the operations the switch needs to perform in order to support
an application similar to the “Simple Toy example” of Section 3.4 of Deliverable D2.3: we want
to limit the maximum number of bytes sent from a given MAC address. The code of the
application will be presented in the following sections. For this application we need to be
able to:
1: keep a per-source-MAC memory
2: count the number of bytes sent, by means of arithmetic operations between a packet header
field and a flow data variable
3: configure conditions
4: evaluate those conditions and take a forwarding decision based on the result of such
evaluations
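As an informal illustration of the four requirements above, the following self-contained C sketch models the intended behaviour outside of the switch. It is not the ofsoftswitch13 code: the table size, the threshold value and the names flow_ctx, ctx_lookup and process_packet are assumptions made only for this example, and whether the packet that crosses the threshold is itself forwarded is a choice of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_FLOWS 64                 /* hypothetical table size */

/* One flow context entry: the key is the source MAC address and the
 * entry holds a single flow data variable (the byte counter). */
struct flow_ctx {
    bool     used;
    uint8_t  src_mac[6];
    uint32_t byte_count;             /* plays the role of a flow data variable */
};

static struct flow_ctx table[MAX_FLOWS];
static uint32_t byte_limit = 1500;   /* plays the role of a global data variable */

/* Requirement 1: find or create the per-source-MAC entry.
 * A linear scan keeps the sketch short; there is no eviction policy. */
static struct flow_ctx *ctx_lookup(const uint8_t mac[6])
{
    struct flow_ctx *free_slot = NULL;
    for (int i = 0; i < MAX_FLOWS; i++) {
        if (table[i].used && memcmp(table[i].src_mac, mac, 6) == 0)
            return &table[i];
        if (!table[i].used && free_slot == NULL)
            free_slot = &table[i];
    }
    assert(free_slot != NULL);
    free_slot->used = true;
    memcpy(free_slot->src_mac, mac, 6);
    free_slot->byte_count = 0;
    return free_slot;
}

/* Requirements 2-4: evaluate the condition first, then update the
 * counter and return the forwarding decision (true = forward). */
static bool process_packet(const uint8_t mac[6], uint32_t pkt_len)
{
    struct flow_ctx *ctx = ctx_lookup(mac);
    bool over_limit = ctx->byte_count > byte_limit;   /* the condition */
    if (over_limit)
        return false;                                 /* drop */
    ctx->byte_count += pkt_len;                       /* arithmetic update */
    return true;
}
```

With the assumed limit of 1500 bytes, two 1000-byte packets from the same MAC address are forwarded and the third packet is dropped, while traffic from other MAC addresses is unaffected.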
1.1.1 SW implementation details
Figure 1 depicts the packet processing workflow; the portions that have been modified with
respect to the basic BEBA are highlighted in red, green, blue and yellow.
Figure 1 - BEBA basic forwarding abstraction prototype workflow
1.1.2 Packing/unpacking of messages from the controller
The flow context table is configured and modified through State Modification messages sent by
the controller. This message is an experimenter message with the exp_type field set to
OFPT_EXP_STATE_MOD. In addition to the standard BEBA commands for enabling the statefulness
of a stage and for configuring the lookup/update scope fields, four new commands have been
added in order to configure header field extractors, conditions, global data variables and
flow data variables. The following code shows the additions to the callback function
responsible for the translation from the wire format to the internal switch format.
ofl_err
ofl_exp_beba_msg_unpack(struct ofp_header const *oh, size_t *len, struct ofl_msg_experimenter **msg,
                        struct ofl_exp const *exp)
{
    […]
    switch (ntohl(exp_header->exp_type)) {
        case (OFPT_EXP_STATE_MOD):
        {
            […]
            switch (dm->command){
                […]
                case OFPSC_EXP_SET_HEADER_FIELD_EXTRACTOR:
                    return ofl_structs_set_header_field_unpack(
                        (struct ofp_exp_set_header_field_extractor const *)&(sm->payload[0]), len,
                        (struct ofl_exp_set_header_field_extractor *)&(dm->payload[0]));
                case OFPSC_EXP_SET_CONDITION:
                    return ofl_structs_set_condition_unpack(
                        (struct ofp_exp_set_condition const *)&(sm->payload[0]), len,
                        (struct ofl_exp_set_condition *)&(dm->payload[0]));
                case OFPSC_EXP_SET_GLOBAL_DATA_VAR:
                    return ofl_structs_set_global_data_var_unpack(
                        (struct ofp_exp_set_global_data_variable const *)&(sm->payload[0]), len,
                        (struct ofl_exp_set_global_data_variable *)&(dm->payload[0]));
                case OFPSC_EXP_SET_FLOW_DATA_VAR:
                    return ofl_structs_set_flow_data_var_unpack(
                        (struct ofp_exp_set_flow_data_variable const *)&(sm->payload[0]), len,
                        (struct ofl_exp_set_flow_data_variable *)&(dm->payload[0]));
                default:
                    return ofl_error(OFPET_EXPERIMENTER, OFPEC_EXP_STATE_MOD_BAD_COMMAND);
            }
        }
        […]
}
Flow modification messages, sent by the controller during flow table population, may contain
BEBA experimenter actions. Two new actions have been introduced: the first,
set-data-variable, updates the flow data variables or the global data variables, while the
second, write-context-to-field, writes the state, a flow data variable or a global data
variable into the packet. The following snippet shows the extension of the function that
translates the experimenter actions from the wire format to the internal one.
ofl_err
ofl_exp_beba_act_unpack(struct ofp_action_header const *src, size_t *len, struct ofl_action_header **dst)
{
    […]
    switch (ntohl(ext->act_type)) {
        […]
        case (OFPAT_EXP_SET_DATA_VAR):
        {
            // At unpack time we do NOT check if stage is stateful and state table is configured:
            // those checks are run at action execution time
            struct ofp_exp_action_set_data_variable *sa;
            struct ofl_exp_action_set_data_variable *da;
            int i;
            uint16_t operand_types;

            sa = (struct ofp_exp_action_set_data_variable *)ext;
            da = (struct ofl_exp_action_set_data_variable *)malloc(sizeof(struct ofl_exp_action_set_data_variable));
            da->header.header.experimenter_id = ntohl(exp->experimenter);
            da->header.act_type = ntohl(ext->act_type);
            *dst = (struct ofl_action_header *)da;

            if (*len < sizeof(struct ofp_exp_action_set_data_variable) + ROUND_UP(sizeof(uint32_t)*(ntohl(sa->field_count)),8) ) {
                OFL_LOG_WARN(LOG_MODULE, "Received SET DATA VAR action has invalid length (%zu)", *len);
                return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_EXP_LEN);
            }
            if (sa->table_id >= PIPELINE_TABLES) {
                if (OFL_LOG_IS_WARN_ENABLED(LOG_MODULE)) {
                    char *ts = ofl_table_to_string(sa->table_id);
                    OFL_LOG_WARN(LOG_MODULE, "Received SET DATA VAR action has invalid table_id (%s).", ts);
                    free(ts);
                }
                return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_TABLE_ID);
            }

            operand_types = ntohs(sa->operand_types);
            // operand_types=aabbccdde0000000 where aa=operand_1_type, bb=operand_2_type,
            // cc=operand_3_type, dd=operand_4_type and e=output_type

            // operand_1 can be FLOW_DATA_VAR, GLOBAL_DATA_VAR or HEADER_FIELD
            error = check_operands((operand_types>>14)&3,sa->operand_1,"operand_1",false,true);
            if (error)
                return error;
            // operand_2 can be FLOW_DATA_VAR, GLOBAL_DATA_VAR, HEADER_FIELD or CONSTANT
            error = check_operands((operand_types>>12)&3,sa->operand_2,"operand_2",true,true);
            if (error)
                return error;
            // output can be FLOW_DATA_VAR or GLOBAL_DATA_VAR
            error = check_operands((operand_types>>7)&1,sa->output,"output",false,false);
            if (error)
                return error;

            if (sa->opcode>OPCODE_POLY_SUM){
                OFL_LOG_WARN(LOG_MODULE, "Received SET DATA VAR action has invalid opcode (%u).", sa->opcode);
                return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_OPCODE);
            }
            if (sa->opcode==OPCODE_AVG || sa->opcode==OPCODE_VAR || sa->opcode==OPCODE_POLY_SUM){
                // operand_3 can be FLOW_DATA_VAR, GLOBAL_DATA_VAR or HEADER_FIELD
                error = check_operands((operand_types>>10)&3,sa->operand_3,"operand_3",false,true);
                if (error)
                    return error;
            }
            if (sa->opcode==OPCODE_POLY_SUM){
                // operand_4 can be FLOW_DATA_VAR, GLOBAL_DATA_VAR or HEADER_FIELD
                error = check_operands((operand_types>>8)&3,sa->operand_4,"operand_4",false,true);
                if (error)
                    return error;
            }

            da->table_id = sa->table_id;
            da->operand_types = ntohs(sa->operand_types);
            da->opcode = sa->opcode;
            da->output = sa->output;
            da->operand_1 = sa->operand_1;
            da->operand_2 = sa->operand_2;
            da->operand_3 = sa->operand_3;
            da->operand_4 = sa->operand_4;
            da->coeff_1 = sa->coeff_1;
            da->coeff_2 = sa->coeff_2;
            da->coeff_3 = sa->coeff_3;
            da->coeff_4 = sa->coeff_4;
            da->field_count=ntohl(sa->field_count);
            for (i=0;i<da->field_count;i++)
                da->fields[i]=ntohl(sa->fields[i]);

            *len -= sizeof(struct ofp_exp_action_set_data_variable) + ROUND_UP(sizeof(uint32_t)*(ntohl(sa->field_count)),8);
            break;
        }
        case (OFPAT_EXP_WRITE_CONTEXT_TO_FIELD):
        {
            struct ofp_exp_action_write_context_to_field *sa;
            struct ofl_exp_action_write_context_to_field *da;

            sa = (struct ofp_exp_action_write_context_to_field *)ext;
            da = (struct ofl_exp_action_write_context_to_field *)malloc(sizeof(struct ofl_exp_action_write_context_to_field));
            da->header.header.experimenter_id = ntohl(exp->experimenter);
            da->header.act_type = ntohl(ext->act_type);
            *dst = (struct ofl_action_header *)da;

            if (*len < sizeof(struct ofp_exp_action_write_context_to_field)) {
                OFL_LOG_WARN(LOG_MODULE, "Received WRITE CONTEXT TO FIELD action has invalid length (%zu).", *len);
                return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_EXP_LEN);
            }
            if (sa->src_type > SOURCE_TYPE_STATE) {
                if (OFL_LOG_IS_WARN_ENABLED(LOG_MODULE)) {
                    OFL_LOG_WARN(LOG_MODULE, "Received WRITE CONTEXT TO FIELD action has invalid src_type (%u).", sa->src_type);
                }
                return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_SOURCE_TYPE);
            }

            switch (sa->src_type){
                case SOURCE_TYPE_FLOW_DATA_VAR:
                    if (sa->src_id >= OFPSC_MAX_FLOW_DATA_VAR_NUM){
                        OFL_LOG_WARN(LOG_MODULE, "Received WRITE CONTEXT TO FIELD action has invalid flow data variable id (src_id) (%u).", sa->src_id );
                        return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_FLOW_DATA_VAR_ID);
                    }
                    break;
                case SOURCE_TYPE_GLOBAL_DATA_VAR:
                    if (sa->src_id >= OFPSC_MAX_GLOBAL_DATA_VAR_NUM){
                        OFL_LOG_WARN(LOG_MODULE, "Received WRITE CONTEXT TO FIELD action has invalid global data variable id (src_id) (%u).", sa->src_id );
                        return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_GLOBAL_DATA_VAR_ID);
                    }
                    break;
                case SOURCE_TYPE_STATE:
                    sa->src_id = 0;
                    break;
            }

            da->src_type = sa->src_type;
            da->src_id = sa->src_id;
            da->dst_field = ntohl(sa->dst_field);

            if(da->dst_field == OXM_OF_IN_PORT || da->dst_field == OXM_OF_IN_PHY_PORT
                || da->dst_field == OXM_OF_METADATA
                || da->dst_field == OXM_OF_IPV6_EXTHDR
                || da->dst_field == OXM_EXP_GLOBAL_STATE
                || da->dst_field == OXM_EXP_STATE
                || da->dst_field == OXM_EXP_CONDITION0
                || da->dst_field == OXM_EXP_CONDITION1
                || da->dst_field == OXM_EXP_CONDITION2
                || da->dst_field == OXM_EXP_CONDITION3
                || da->dst_field == OXM_EXP_CONDITION4
                || da->dst_field == OXM_EXP_CONDITION5
                || da->dst_field == OXM_EXP_CONDITION6
                || da->dst_field == OXM_EXP_CONDITION7
                || da->dst_field == OXM_EXP_TIMESTAMP
                || da->dst_field == OXM_EXP_RANDOM
                || da->dst_field == OXM_EXP_PKT_LEN){
                return ofl_error(OFPET_BAD_ACTION, OFPBAC_BAD_SET_TYPE);
                break;
            }

            *len -= sizeof(struct ofp_exp_action_write_context_to_field);
            break;
        }
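The operand_types layout quoted in the set-data-variable listing (aabbccdde0000000) can be made explicit with a small pack/unpack pair. The sketch below only mirrors the bit positions used by the shifts in that listing (14, 12, 10, 8 and 7); the struct and function names and the numeric values of the type codes are assumptions made for illustration, since the real constants are defined in the BEBA headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type codes; the real values live in the BEBA headers. */
enum {
    SK_TYPE_FLOW_DATA_VAR   = 0,
    SK_TYPE_GLOBAL_DATA_VAR = 1,
    SK_TYPE_HEADER_FIELD    = 2,
    SK_TYPE_CONSTANT        = 3
};

/* Decoded view of the 16-bit operand_types field (host byte order). */
struct operand_types {
    uint8_t op1, op2, op3, op4;   /* 2 bits each: aa, bb, cc, dd */
    uint8_t output;               /* 1 bit: e */
};

/* Builds aabbccdde0000000 from the individual type codes. */
static uint16_t pack_operand_types(struct operand_types t)
{
    assert(t.op1 < 4 && t.op2 < 4 && t.op3 < 4 && t.op4 < 4 && t.output < 2);
    return (uint16_t)((t.op1 << 14) | (t.op2 << 12) | (t.op3 << 10) |
                      (t.op4 << 8)  | (t.output << 7));
}

/* Extracts the type codes with the same shifts used by the switch. */
static struct operand_types unpack_operand_types(uint16_t v)
{
    struct operand_types t;
    t.op1    = (v >> 14) & 3;
    t.op2    = (v >> 12) & 3;
    t.op3    = (v >> 10) & 3;
    t.op4    = (v >> 8)  & 3;
    t.output = (v >> 7)  & 1;
    return t;
}
```

Under the assumed codes, a header-field first operand, a constant second operand and a global data variable output pack to 0xB080. The value travels in network byte order on the wire, which is why the switch applies ntohs() before shifting.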
Conditions are defined by two operands and an operation and are configured by the controller
with a State Modification message with command set to OFPSC_EXP_SET_CONDITION. The
unpacking operation converts the message to the internal representation and checks the
validity of the operands and of the operation.
ofl_err
ofl_structs_set_condition_unpack(struct ofp_exp_set_condition const *src, size_t *len,
                                 struct ofl_exp_set_condition *dst) {
    ofl_err error;

    if(*len == sizeof(struct ofp_exp_set_condition)) {
        if (src->table_id >= PIPELINE_TABLES) {
            OFL_LOG_WARN(LOG_MODULE, "Received STATE_MOD message has invalid table id (%u).", src->table_id );
            return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_TABLE_ID);
        }
        if (src->condition_id >= OFPSC_MAX_CONDITIONS_NUM) {
            OFL_LOG_WARN(LOG_MODULE, "Received STATE_MOD message has invalid condition id (%u).", src->condition_id );
            return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_CONDITION_ID);
        }
        if (src->condition > 5) {
            OFL_LOG_WARN(LOG_MODULE, "Received STATE_MOD message has invalid condition (%u).", src->condition );
            return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_CONDITION);
        }

        // operand_types=xxyy0000 where xx=operand_1_type and yy=operand_2_type
        // operand_1 can be FLOW_DATA_VAR, GLOBAL_DATA_VAR or HEADER_FIELD
        error = check_operands((src->operand_types>>6)&3,src->operand_1,"operand_1",false,true);
        if (error)
            return error;
        // operand_2 can be FLOW_DATA_VAR, GLOBAL_DATA_VAR or HEADER_FIELD
        error = check_operands((src->operand_types>>4)&3,src->operand_2,"operand_2",false,true);
        if (error)
            return error;

        dst->table_id = src->table_id;
        dst->condition_id = src->condition_id;
        dst->condition = src->condition;
        dst->operand_types = src->operand_types;
        dst->operand_1 = src->operand_1;
        dst->operand_2 = src->operand_2;
    }
    else {
        //check of struct ofp_exp_set_condition length.
        OFL_LOG_WARN(LOG_MODULE, "Received STATE_MOD set_condition has invalid length (%zu).", *len);
        return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_EXP_LEN);
    }

    *len -= sizeof(struct ofp_exp_set_condition);
    return 0;
}
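The unpack routines in this section share one pattern: validate the remaining length, validate the identifiers, convert multi-byte fields from network to host byte order, and finally consume the bytes from *len. The following standalone sketch isolates that pattern. The structure layouts, the error codes and the limit SK_PIPELINE_TABLES are hypothetical and do not correspond to the actual BEBA wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>   /* ntohl, htonl (POSIX) */

#define SK_PIPELINE_TABLES 64        /* hypothetical number of tables */

/* Hypothetical wire format: multi-byte fields in network byte order. */
struct sk_wire_set_var {
    uint8_t  table_id;
    uint8_t  var_id;
    uint8_t  pad[2];
    uint32_t value;
    uint32_t mask;
};

/* Hypothetical internal (host byte order) representation. */
struct sk_host_set_var {
    uint8_t  table_id;
    uint8_t  var_id;
    uint32_t value;
    uint32_t mask;
};

enum sk_err { SK_OK = 0, SK_BAD_LEN = -1, SK_BAD_TABLE_ID = -2 };

static enum sk_err
sk_set_var_unpack(const struct sk_wire_set_var *src, size_t *len,
                  struct sk_host_set_var *dst)
{
    assert(src != NULL && len != NULL && dst != NULL);

    /* 1. The payload must have exactly the expected size. */
    if (*len != sizeof *src)
        return SK_BAD_LEN;
    /* 2. Identifiers are range-checked before being stored. */
    if (src->table_id >= SK_PIPELINE_TABLES)
        return SK_BAD_TABLE_ID;
    /* 3. Single bytes are copied, 32-bit fields are byte-swapped. */
    dst->table_id = src->table_id;
    dst->var_id   = src->var_id;
    dst->value    = ntohl(src->value);
    dst->mask     = ntohl(src->mask);
    /* 4. Only on success are the bytes consumed. */
    *len -= sizeof *src;
    return SK_OK;
}
```

On failure the sketch leaves *len untouched, so the caller can still report how many bytes were left unparsed.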
The operands of a condition can be flow data variables, global data variables or packet header
fields. In our example application we want to check whether the number of bytes sent exceeds a
given threshold: the threshold can be stored in a global data variable and configured with a
State Modification message with command set to OFPSC_EXP_SET_GLOBAL_DATA_VAR. The listing
that follows shows the unpacking routine for the global state,
ofl_structs_set_global_state_unpack().
static ofl_err
ofl_structs_set_global_state_unpack(struct ofp_exp_set_global_state const *src, size_t *len,
                                    struct ofl_exp_set_global_state *dst)
{
    if (*len == 2*sizeof(uint32_t)) {
        dst->global_state = ntohl(src->global_state);
        dst->global_state_mask = ntohl(src->global_state_mask);
    }
    else {
        OFL_LOG_WARN(LOG_MODULE, "Received STATE_MOD set global state has invalid length (%zu).", *len);
        return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_EXP_LEN);
    }

    *len -= sizeof(struct ofp_exp_set_global_state);
    return 0;
}
If a condition or a set-data-variable action uses a packet header field as an operand, we need
to configure a header field extractor with a State Modification message with command set to
OFPSC_EXP_SET_HEADER_FIELD_EXTRACTOR.
static ofl_err
ofl_structs_set_header_field_unpack(struct ofp_exp_set_header_field_extractor const *src, size_t *len,
                                    struct ofl_exp_set_header_field_extractor *dst) {
    if(*len == sizeof(struct ofp_exp_set_header_field_extractor)){
        if (src->table_id >= PIPELINE_TABLES) {
            OFL_LOG_WARN(LOG_MODULE, "Received STATE_MOD message has invalid table id (%u).", src->table_id );
            return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_TABLE_ID);
        }
        if (src->extractor_id >= OFPSC_MAX_HEADER_FIELDS) {
            OFL_LOG_WARN(LOG_MODULE, "Received STATE_MOD message has invalid extractor id (%u).", src->extractor_id );
            return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_EXTRACTOR_ID);
        }
        // header field extractor should be a field <=32 bit
        if ((OXM_VENDOR(ntohl(src->field))==0xFFFF && OXM_LENGTH(ntohl(src->field))-EXP_ID_LEN > 4) ||
            (OXM_VENDOR(ntohl(src->field))!=0xFFFF && OXM_LENGTH(ntohl(src->field)) > 4)) {
            OFL_LOG_WARN(LOG_MODULE, "Received STATE_MOD message has invalid header field size (%u).",
                         OXM_LENGTH(ntohl(src->field)));
            return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_HEADER_FIELD_SIZE);
        }
        dst->table_id = src->table_id;
        dst->extractor_id = src->extractor_id;
        dst->field = ntohl(src->field);
    }
    else {
        //check of struct ofp_exp_set_header_field_extractor length.
        OFL_LOG_WARN(LOG_MODULE, "Received STATE_MOD set_header_field has invalid length (%zu).", *len);
        return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_EXP_LEN);
    }

    *len -= sizeof(struct ofp_exp_set_header_field_extractor);
    return 0;
}
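The size check in the listing above relies on the OXM header layout of OpenFlow 1.3, in which a 32-bit header carries a 16-bit class, a 7-bit field, a 1-bit has-mask flag and an 8-bit payload length; for the experimenter class (0xFFFF) the payload also contains a 4-byte experimenter ID. The sketch below reproduces the check with locally defined macros. The SK_ prefixed names are introduced here for illustration, and the rejection of experimenter fields shorter than the experimenter ID is an addition of the sketch that is not present in the listing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* OXM header (host byte order): class:16 | field:7 | hasmask:1 | length:8 */
#define SK_OXM_CLASS(h)   ((uint16_t)((h) >> 16))
#define SK_OXM_LENGTH(h)  ((uint8_t)((h) & 0xff))

#define SK_OXM_CLASS_EXPERIMENTER 0xFFFF
#define SK_EXP_ID_LEN             4     /* experimenter ID bytes in the payload */

/* Returns true if the value carried by the field fits in 32 bits,
 * i.e. if it can be used as a header field extractor. */
static bool sk_fits_in_32_bits(uint32_t oxm_header)
{
    unsigned payload = SK_OXM_LENGTH(oxm_header);

    if (SK_OXM_CLASS(oxm_header) == SK_OXM_CLASS_EXPERIMENTER) {
        if (payload < SK_EXP_ID_LEN)
            return false;               /* malformed experimenter field */
        payload -= SK_EXP_ID_LEN;       /* the ID is not part of the value */
    }
    return payload <= 4;
}
```

For example, the standard header of OXM_OF_IPV4_SRC (0x80001604, 4-byte value) passes the check, while OXM_OF_ETH_SRC (0x80000806, 6-byte value) does not, so a MAC address cannot be used as a header field extractor.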
1.1.3 Processing of messages from the controller
All the code snippets seen so far, implemented in the blocks highlighted in red in Figure 1,
take care of the unpacking operations. After this phase, the switch needs to apply the
received messages: all these operations (green box in Figure 1) are handled by the
handle_state_mod() function.
ofl_err
handle_state_mod(struct pipeline *pl, struct ofl_exp_msg_state_mod *msg,
                 const struct sender *sender UNUSED, struct ofl_exp_msg_notify_state_change * ntf_message) {
    switch (msg->command){
        […]
        case OFPSC_EXP_SET_HEADER_FIELD_EXTRACTOR:{
            struct ofl_exp_set_header_field_extractor *p = (struct ofl_exp_set_header_field_extractor *) msg->payload;
            struct state_table *st = pl->tables[p->table_id]->state_table;
            if (state_table_is_stateful(st)){
                return state_table_set_header_field_extractor(st, p);
            }
            else{
                OFL_LOG_WARN(LOG_MODULE, "ERROR STATE MOD at stage %u: stage not stateful", p->table_id);
                return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_HEADER_EXTRACTOR);
            }
            break;}
        case OFPSC_EXP_SET_CONDITION:{
            struct ofl_exp_set_condition *p = (struct ofl_exp_set_condition *) msg->payload;
            struct state_table *st = pl->tables[p->table_id]->state_table;
            if (state_table_is_stateful(st)){
                return state_table_set_condition(st, p);
            }
            else{
                OFL_LOG_WARN(LOG_MODULE, "ERROR STATE MOD at stage %u: stage not stateful", p->table_id);
                return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_CONDITION);
            }
            break;}
        case OFPSC_EXP_SET_GLOBAL_DATA_VAR:{
            struct ofl_exp_set_global_data_variable *p = (struct ofl_exp_set_global_data_variable *) msg->payload;
            struct state_table *st = pl->tables[p->table_id]->state_table;
            if (state_table_is_stateful(st)){
                uint32_t global_data_var = st->global_data_var[p->global_data_variable_id];
                global_data_var = (global_data_var & ~(p->mask)) | (p->value & p->mask);
                st->global_data_var[p->global_data_variable_id] = global_data_var;
                OFL_LOG_DBG(LOG_MODULE, "Global data variable %u configured to value %d",
                            p->global_data_variable_id, st->global_data_var[p->global_data_variable_id]);
            }
            else{
                OFL_LOG_WARN(LOG_MODULE, "ERROR STATE MOD at stage %u: stage not stateful", p->table_id);
                return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_CONDITION);
            }
            break;}
        case OFPSC_EXP_SET_FLOW_DATA_VAR:{
            struct ofl_exp_set_flow_data_variable *p = (struct ofl_exp_set_flow_data_variable *) msg->payload;
            struct state_table *st = pl->tables[p->table_id]->state_table;
            if (state_table_is_stateful(st) && state_table_is_configured(st)){
                return state_table_set_flow_data_variable(st, p);
            }
            else{
                OFL_LOG_WARN(LOG_MODULE, "ERROR STATE MOD at stage %u: stage not stateful or not configured", p->table_id);
                return ofl_error(OFPET_EXPERIMENTER, OFPEC_EXP_SET_FLOW_STATE);
            }
            break;}
        default:
            return ofl_error(OFPET_EXPERIMENTER, OFPEC_EXP_STATE_MOD_FAILED);
    }
    ofl_msg_free((struct ofl_msg_header *)msg, pl->dp->exp);
    return 0;
}
All of the above messages fail if the stage has not been configured as stateful. The last
one, with command OFPSC_EXP_SET_FLOW_DATA_VAR, additionally requires the lookup/update
scopes of the flow context table to be already configured.
ofl_err state_table_set_header_field_extractor(struct state_table *table,
                                               struct ofl_exp_set_header_field_extractor *hfe) {
    struct key_extractor *dest;
    dest = &table->header_field_extractor[hfe->extractor_id];
    dest->field_count = 1;
    dest->fields[0] = hfe->field;
    OFL_LOG_DBG(LOG_MODULE, "Header field extractor %u configured",hfe->extractor_id);
    return 0;
}

ofl_err state_table_set_condition(struct state_table *table, struct ofl_exp_set_condition *p) {
    struct condition_table_entry* cte = (struct condition_table_entry*) malloc(sizeof(struct condition_table_entry));
    cte->condition = p->condition;
    cte->operand_1_type = (p->operand_types>>6)&3;
    cte->operand_2_type = (p->operand_types>>4)&3;
    cte->operand_1 = p->operand_1;
    cte->operand_2 = p->operand_2;
    table->condition_table[p->condition_id] = cte;
    OFL_LOG_DBG(LOG_MODULE, "Condition %u configured",p->condition_id);
    return 0;
}

ofl_err state_table_set_flow_data_variable(struct state_table *table, struct ofl_exp_set_flow_data_variable *p) {
    uint8_t key[OFPSC_MAX_KEY_LEN] = {0};
    struct state_entry *e;
    uint64_t now;
    struct timeval tv;
    uint32_t key_len = compute_key_len(&table->write_key); //update-scope key extractor length

    if(key_len == p->key_len)
    {
        memcpy(key, p->key, p->key_len);
    }
    else
    {
        OFL_LOG_WARN(LOG_MODULE, "key extractor length != received key length");
        return ofl_error(OFPET_EXPERIMENTER, OFPEC_BAD_EXP_LEN);
    }

    HMAP_FOR_EACH_WITH_HASH(e, struct state_entry,
        hmap_node, hash_bytes(key, OFPSC_MAX_KEY_LEN, 0), &table->state_entries){
        if (!memcmp(key, e->key, OFPSC_MAX_KEY_LEN)){
            OFL_LOG_DBG(LOG_MODULE, "Set flow data variable: state entry found. Updating flow_data_var[%d]=%d",
                        p->flow_data_variable_id,
                        (e->flow_data_var[p->flow_data_variable_id] & ~(p->mask)) | (p->value & p->mask));
            e->flow_data_var[p->flow_data_variable_id] =
                (e->flow_data_var[p->flow_data_variable_id] & ~(p->mask)) | (p->value & p->mask);
            return 0;
        }
    }

    gettimeofday(&tv,NULL);
    now = 1000000 * tv.tv_sec + tv.tv_usec;
    e = xmalloc(sizeof(struct state_entry));
    memset(e,0,sizeof(struct state_entry));
    e->created = now;
    e->stats = xmalloc(sizeof(struct ofl_exp_state_stats));
    memset(e->stats,0,sizeof(struct ofl_exp_state_stats));
    memcpy(e->key, key, OFPSC_MAX_KEY_LEN);
    e->state = STATE_DEFAULT;
    e->flow_data_var[p->flow_data_variable_id] = p->value & p->mask;
    hmap_insert(&table->state_entries, &e->hmap_node, hash_bytes(key, OFPSC_MAX_KEY_LEN, 0));
    OFL_LOG_DBG(LOG_MODULE, "Set flow data variable: state entry not found! A new state entry will be created with flow_data_var[%d]=%d",
                p->flow_data_variable_id, e->flow_data_var[p->flow_data_variable_id]);
    return 0;
}
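Two details of the listings above are worth isolating: the masked write used for both global and flow data variables, (old & ~mask) | (value & mask), and the fact that keys are hashed and compared over the full OFPSC_MAX_KEY_LEN bytes, which works because the key buffer is zero-initialised before the received key is copied in. The following sketch shows both in isolation; the function names and the key length of 48 bytes are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_MAX_KEY_LEN 48   /* hypothetical stand-in for OFPSC_MAX_KEY_LEN */

/* Only the bits selected by mask are taken from value;
 * all the other bits keep their previous content. */
static uint32_t sk_masked_write(uint32_t old, uint32_t value, uint32_t mask)
{
    return (old & ~mask) | (value & mask);
}

/* Copies a key into a fixed-size, zero-padded buffer so that hashing and
 * memcmp() over SK_MAX_KEY_LEN bytes give the same result for equal keys,
 * regardless of what the buffer contained before. */
static void sk_pad_key(uint8_t out[SK_MAX_KEY_LEN], const uint8_t *key, size_t key_len)
{
    assert(key_len <= SK_MAX_KEY_LEN);
    memset(out, 0, SK_MAX_KEY_LEN);
    memcpy(out, key, key_len);
}
```

With mask 0x0000FFFF, writing 0x11223344 over 0xAABBCCDD yields 0xAABB3344; a mask of 0xFFFFFFFF overwrites the whole variable, which is the natural choice for setting a threshold.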
1.1.4 Data packet processing
Once all the flow entries have been installed and the conditions and header field extractors
have been configured, traffic can start flowing into the network. We now describe the changes
in the packet processing performed by the switch (blue and yellow blocks in Figure 1).
Each packet is passed through the stages of the pipeline. Conditions and state are local to
each table, so any metadata fields left over from a previous table in the pipeline must be
removed.
If the current stage is stateful, the standard BEBA state lookup is performed and the ‘state’
information is appended to the packet. Then, before the packet is matched against the flow
rules, all the configured conditions are evaluated and their results are appended to the
packet as 8-bit metadata fields: the value is 1 if the condition is met, 0 otherwise. If a
condition has not been configured or cannot be evaluated (e.g. the header field configured
as one of the two operands is not part of the current packet), the corresponding metadata
field is not added to the packet, which prevents any flow entry matching on that condition
from being selected.
void pipeline_process_packet(struct pipeline *pl, struct packet *pkt){
    […]
    while (next_table != NULL) {
        struct flow_entry *entry;
        struct state_entry *state_entry;
        int condition_evaluation_result = 0;

        VLOG_DBG_RL(LOG_MODULE, &rl, "trying table %u.", next_table->stats->table_id);
        pkt->table_id = next_table->stats->table_id;
        table = next_table;
        next_table = NULL;

        //removes any old 'state' virtual header field
        HMAP_FOR_EACH_WITH_HASH(f, struct ofl_match_tlv,
            hmap_node, hash_int(OXM_EXP_STATE,0), &pkt->handle_std->match.match_fields){
            hmap_remove_and_shrink(&pkt->handle_std->match.match_fields,&f->hmap_node);
        }
        //removes any old 'condition' virtual header field
        for (i=0;i<OFPSC_MAX_CONDITIONS_NUM;i++){
            HMAP_FOR_EACH_WITH_HASH(f, struct ofl_match_tlv,
                hmap_node, hash_int(conditions_OXM_array[i],0), &pkt->handle_std->match.match_fields){
                hmap_remove_and_shrink(&pkt->handle_std->match.match_fields,&f->hmap_node);
            }
        }

        if (state_table_is_stateful(table->state_table)) {
            //Conditions evaluation should be independent from extractors configuration,
            //but stage must be stateful
            if (state_table_is_configured(table->state_table)) {
                state_entry = state_table_lookup(table->state_table, pkt);
                if(state_entry!=NULL){
                    ofl_structs_match_exp_put32(&pkt->handle_std->match, OXM_EXP_STATE, 0xBEBABEBA, 0x00000000);
                    state_table_write_state(state_entry, pkt);
                }
            }
            //Conditions evaluation
            for (i=0;i<OFPSC_MAX_CONDITIONS_NUM;i++){
                if (table->state_table->condition_table[i]!=NULL) {
                    VLOG_DBG_RL(LOG_MODULE, &rl, "Evaluating condition %d.", i);
                    condition_evaluation_result = state_table_evaluate_condition(table->state_table, pkt,
                                                      table->state_table->condition_table[i]);
                    if (condition_evaluation_result!=-1) {
                        VLOG_DBG_RL(LOG_MODULE, &rl, "result = %d.", condition_evaluation_result);
                        ofl_structs_match_exp_put8(&pkt->handle_std->match, conditions_OXM_array[i], 0xBEBABEBA,
                                                   condition_evaluation_result);
                    }
                }
            }
        }
        […]
}
Condition evaluation returns 1 or 0 if the condition can be calculated on the packet, -1
otherwise.
int state_table_evaluate_condition(struct state_table *state_table,struct packet *pkt,
                                   struct condition_table_entry* condition_table_entry) {
    //Comparison is made by converting fields value to integers.
    //Header field extractors always refer to field of length <=32 bit
    uint32_t operand_1_value = 0;
    uint32_t operand_2_value = 0;

    if (!retrieve_operand(&operand_1_value, condition_table_entry->operand_1_type, condition_table_entry->operand_1,
                          "condition_operand_1", state_table, pkt, &state_table->read_key))
        return -1;
    if (!retrieve_operand(&operand_2_value, condition_table_entry->operand_2_type, condition_table_entry->operand_2,
                          "condition_operand_2", state_table, pkt, &state_table->read_key))
        return -1;

    switch(condition_table_entry->condition){
        case CONDITION_GT:{
            OFL_LOG_DBG(LOG_MODULE, "condition=CONDITION_GT");
            return operand_1_value>operand_2_value;}
        case CONDITION_LT:{
            OFL_LOG_DBG(LOG_MODULE, "condition=CONDITION_LT");
            return operand_1_value<operand_2_value;}
        case CONDITION_GTE:{
            OFL_LOG_DBG(LOG_MODULE, "condition=CONDITION_GTE");
            return operand_1_value>=operand_2_value;}
        case CONDITION_LTE:{
            OFL_LOG_DBG(LOG_MODULE, "condition=CONDITION_LTE");
            return operand_1_value<=operand_2_value;}
        case CONDITION_EQ:{
            OFL_LOG_DBG(LOG_MODULE, "condition=CONDITION_EQ");
            return operand_1_value==operand_2_value;}
        case CONDITION_NEQ:{
            OFL_LOG_DBG(LOG_MODULE, "condition=CONDITION_NEQ");
            return operand_1_value!=operand_2_value;}
        default:{
            return -1;}
    }
    return -1;
}
After the state lookup and the evaluation of the conditions, the packet carries metadata fields
for the state and for the condition results, and it can finally be matched against the flow
table.
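The evaluate-then-annotate step can be summarised with a short Python model. It is only a sketch of the behaviour described above, not the switch code: the function names evaluate_condition and annotate_packet, the dictionary used for the packet metadata and the use of None for an operand that cannot be retrieved are assumptions made for illustration.

```python
import operator

# Comparison operators, keyed by the condition names used in the switch code.
CONDITIONS = {
    "CONDITION_GT": operator.gt,
    "CONDITION_LT": operator.lt,
    "CONDITION_GTE": operator.ge,
    "CONDITION_LTE": operator.le,
    "CONDITION_EQ": operator.eq,
    "CONDITION_NEQ": operator.ne,
}


def evaluate_condition(condition, operand_1, operand_2):
    """Return 1 or 0 when the condition can be computed, -1 otherwise."""
    if operand_1 is None or operand_2 is None:
        return -1  # hypothetical marker: operand not available for this packet
    compare = CONDITIONS.get(condition)
    if compare is None:
        return -1  # unknown condition code
    return int(compare(operand_1, operand_2))


def annotate_packet(metadata, condition_table, operands):
    """Add a 'conditionN' field for every configured condition that can be evaluated."""
    for i, condition in enumerate(condition_table):
        if condition is None:
            continue  # condition slot not configured
        result = evaluate_condition(condition, *operands[i])
        if result != -1:
            metadata["condition%d" % i] = result
    return metadata
```

As in the switch code, a condition that cannot be evaluated adds no metadata field, so flow entries matching on that condition do not match the packet.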
Compared with basic BEBA, there are two new stateful actions (yellow block in Figure 1):
set-data-variable, which updates the flow/global data variables, and write-context-to-field,
which writes a flow/global data variable or the flow state into a packet header field.
void
dp_exp_action(struct packet *pkt, struct ofl_action_experimenter *act) {
if(act->experimenter_id == BEBA_VENDOR_ID)
{
struct ofl_exp_beba_act_header *action;
struct ofl_exp_msg_notify_state_change ntf_message;
action = (struct ofl_exp_beba_act_header *) act;
switch(action->act_type){
[…]
case (OFPAT_EXP_SET_DATA_VAR):
{
struct ofl_exp_action_set_data_variable *act = (struct ofl_exp_action_set_data_variable *)action;
struct state_table *st = pkt->dp->pipeline->tables[act->table_id]->state_table;
// Stage must be stateful, but we check if state_table_is_configured() only if output/operands are flow data variables.
if (state_table_is_stateful(st)){
VLOG_DBG_RL(LOG_MODULE, &rl, "executing action SET DATA VAR at stage %u", act->table_id);
state_table_set_data_variable(st, act, pkt);
}
else{
VLOG_DBG_RL(LOG_MODULE, &rl, "ERROR SET DATA VAR at stage %u: stage not stateful", act->table_id);
}
break;
}
case (OFPAT_EXP_WRITE_CONTEXT_TO_FIELD):
{
struct ofl_exp_action_write_context_to_field *act = (struct ofl_exp_action_write_context_to_field
*)action;
struct state_table *st = pkt->dp->pipeline->tables[pkt->table_id]->state_table;
struct ofl_action_set_field* set_field_act;
if (state_table_is_stateful(st) && state_table_is_configured(st)){
VLOG_DBG_RL(LOG_MODULE, &rl, "executing action WRITE CONTEXT TO FIELD at stage %u", pkt->table_id);
set_field_act = (struct ofl_action_set_field*) state_table_write_context_to_field(st, act,
pkt);
if (set_field_act!=NULL){
dp_actions_set_field(pkt, set_field_act);
}
}
else{
VLOG_DBG_RL(LOG_MODULE, &rl, "WRITE CONTEXT TO FIELD at stage %u: stage not stateful", pkt->table_id);
}
break;
}
default:
VLOG_WARN_RL(LOG_MODULE, &rl, "Trying to execute unknown experimenter action (%u).", htonl(act->experimenter_id));
break;
}
[…]
}
}
The set-data-variable action updates the value of a flow/global data variable by means of
arithmetic operations between flow data variables, global data variables, packet header fields
and constants. If the output of the action is a flow data variable, the flow whose data
variable is updated is the one identified by the update scope. The current value of the flow
data variable, if needed by the update operation, is also read using the update scope. For
example, to perform an update action like FDV[0] = SUM(FDV[0],GDV[2]), the switch needs to
read the first flow data variable from the context of the flow extracted with the update scope,
add to it the value of the third global data variable and finally store the result in the first
flow data variable of the same flow.
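The FDV[0] = SUM(FDV[0],GDV[2]) example can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the switch implementation: the function name set_data_variable_sum, the dictionary of flow contexts keyed by the update-scope key, the number of flow data variables (4) and the creation of a zero-initialised context for an unknown flow are all hypothetical. The 32-bit wrap-around mirrors the uint32_t arithmetic of the switch code.

```python
MASK32 = 0xFFFFFFFF  # data variables are 32-bit unsigned integers in the switch code


def set_data_variable_sum(flow_contexts, global_data_var, update_key):
    """Model FDV[0] = SUM(FDV[0], GDV[2]) for the flow selected by update_key."""
    # Hypothetical: an unknown flow starts with 4 flow data variables set to 0.
    fdv = flow_contexts.setdefault(update_key, [0] * 4)
    # Read FDV[0] with the update scope, add GDV[2], store back in the same flow.
    fdv[0] = (fdv[0] + global_data_var[2]) & MASK32
    return fdv[0]
```

Each flow keeps its own FDV[0], while GDV[2] is shared by all the flows of the stage.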
Each operation has at least two operands and at least one output. For example, SUM, SUB, MUL
and DIV require just two operands and produce one output. AVG requires two operands and
produces two outputs. VAR requires three operands and produces three outputs. Finally,
POLY_SUM requires four operands and four coefficients and produces just one output. The
details of the operations performed can be found in the following code.
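As an illustration of the AVG operation, the following Python sketch reproduces the incremental update used by the switch, where the average is stored multiplied by a scaling factor to keep three decimal places. The function name avg_step is hypothetical, MULTIPLY_FACTOR = 1000 is assumed from the "avg*1000" comments in the switch code, and the 32-bit overflow of the real uint32_t variables is ignored here.

```python
MULTIPLY_FACTOR = 1000  # assumed value: the average is stored as avg*1000


def avg_step(count, avg_scaled, sample):
    """Return the updated (count, scaled average) after adding one sample."""
    new_count = count + 1
    # Integer division, as in the C code: the scaled average is truncated.
    new_avg_scaled = (avg_scaled * count + sample * MULTIPLY_FACTOR) // (count + 1)
    return new_count, new_avg_scaled


# Usage: feed the samples 10, 20 and 25 one at a time.
count, avg_scaled = 0, 0
for sample in (10, 20, 25):
    count, avg_scaled = avg_step(count, avg_scaled, sample)
```

After the three samples the stored value is 18333, that is an average of 18.333 once divided by the scaling factor.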
void state_table_set_data_variable(struct state_table *table, struct ofl_exp_action_set_data_variable *act, struct
packet *pkt) {
// At unpack time we have checked just operands IDs validity. Now, at action execution time, we need to check
// if stage is stateful and state table is configured.
uint32_t result1 = 0;
uint32_t result2 = 0;
uint32_t result3 = 0;
uint32_t output_value = 0;
uint32_t operand_1_value = 0;
uint32_t operand_2_value = 0;
uint32_t operand_3_value = 0;
uint32_t operand_4_value = 0;
//coeff_x are signed integers!
int8_t coeff_1 = 0;
int8_t coeff_2 = 0;
int8_t coeff_3 = 0;
int8_t coeff_4 = 0;
uint8_t key[OFPSC_MAX_KEY_LEN] = {0}; //used to access state table
int i;
uint32_t key_len=0; //update-scope key extractor length
struct key_extractor *extractor=&table->write_key; //if not specified in the action, updates are done using the preconfigured update-scope
struct key_extractor dummy_key_extract;
// If one of the operands or the output is a FLOW_DATA_VAR....
if ((((act->operand_types>>14)&3)==OPERAND_TYPE_FLOW_DATA_VAR) || (((act->operand_types>>12)&3)==OPERAND_TYPE_FLOW_DATA_VAR) || ((act->operand_types>>7)&1)==OPERAND_TYPE_FLOW_DATA_VAR
|| ( (act->opcode==OPCODE_AVG || act->opcode==OPCODE_VAR || act->opcode==OPCODE_POLY_SUM) && (((act->operand_types>>10)&3)==OPERAND_TYPE_FLOW_DATA_VAR))
|| ( (act->opcode==OPCODE_POLY_SUM) && (((act->operand_types>>8)&3)==OPERAND_TYPE_FLOW_DATA_VAR)) ) {
// ...state table's extractors must have been configured...
if (!state_table_is_configured(table)){
OFL_LOG_DBG(LOG_MODULE, "State table's extractors are not configured!");
return;
}
// in case hardcoded update-key is specified in the action
if (act->field_count>0){
OFL_LOG_WARN(LOG_MODULE, "hardcoded update extractor");
if (act->field_count!=table->write_key.field_count){
OFL_LOG_WARN(LOG_MODULE, "key extractor length != hardcoded key length");
return;
}
dummy_key_extract.table_id = pkt->table_id;
dummy_key_extract.field_count = act->field_count;
for (i=0;i<act->field_count;i++)
dummy_key_extract.fields[i]=act->fields[i];
extractor = &dummy_key_extract;
}
key_len = compute_key_len(extractor); //update-scope key extractor length
if(!__extract_key(key, extractor, pkt)){
OFL_LOG_DBG(LOG_MODULE, "update key fields not found in the packet's header");
return;
}
}
// operand_types=aabbccdde0000000 where aa=operand_1_type, bb=operand_2_type, cc=operand_3_type, dd=operand_4_type and e=output_type
if (!retrieve_operand(&operand_1_value, (act->operand_types>>14)&3, act->operand_1, "operand_1", table, pkt,
extractor))
return;
if (!retrieve_operand(&operand_2_value, (act->operand_types>>12)&3, act->operand_2, "operand_2", table, pkt,
extractor))
return;
// operand_3 is needed only by OPCODE_VAR, and OPCODE_POLY_SUM
if (act->opcode==OPCODE_VAR || act->opcode==OPCODE_POLY_SUM) {
if (!retrieve_operand(&operand_3_value, (act->operand_types>>10)&3, act->operand_3, "operand_3", table,
pkt, extractor))
return;
}
// operand_4 and coeff_x are needed only by OPCODE_POLY_SUM
if (act->opcode==OPCODE_POLY_SUM){
if (!retrieve_operand(&operand_4_value, (act->operand_types>>8)&3, act->operand_4, "operand_4", table,
pkt, extractor))
return;
coeff_1 = act->coeff_1;
coeff_2 = act->coeff_2;
coeff_3 = act->coeff_3;
coeff_4 = act->coeff_4;
}
// OPCODE_AVG and OPCODE_VAR needs the current value of "output" operand
if (act->opcode==OPCODE_AVG || act->opcode==OPCODE_VAR){
if (!retrieve_operand(&output_value, (act->operand_types>>7)&1, act->output, "output", table, pkt,
extractor))
return;
}
// Calculate result(s)
switch(act->opcode){
case OPCODE_SUM:{
OFL_LOG_DBG(LOG_MODULE, "Executing OPCODE_SUM");
// sum( output , in1 , in2) = (OUT1 , IN1 , IN2) has 2 inputs and 1 output
// output = in1 + in2
result1 = operand_1_value + operand_2_value;
break;}
case OPCODE_SUB:{
OFL_LOG_DBG(LOG_MODULE, "Executing OPCODE_SUB");
// sub( output , in1 , in2) = (OUT1 , IN1 , IN2) has 2 inputs and 1 output
// output = in1 - in2
result1 = operand_1_value - operand_2_value;
break;}
case OPCODE_MUL:{
OFL_LOG_DBG(LOG_MODULE, "Executing OPCODE_MUL");
// mul( output , in1 , in2) = (OUT1 , IN1 , IN2) has 2 inputs and 1 output
// output = in1 * in2
result1 = operand_1_value * operand_2_value;
break;}
case OPCODE_DIV:{
OFL_LOG_DBG(LOG_MODULE, "Executing OPCODE_DIV");
// div( output , in1 , in2) = (OUT1 , IN1 , IN2) has 2 inputs and 1 output
// output = in1 / in2
if (operand_1_value==0)
result1 = 0;
else if (operand_2_value==0)
result1 = 0xffffffff;
else
result1 = operand_1_value / operand_2_value;
break;}
case OPCODE_AVG:{
OFL_LOG_DBG(LOG_MODULE, "Executing OPCODE_AVG");
// avg( [count] , [value_to_be_averaged] , [avg_value]) = (IO1 , IN1 , IO2) has 3 inputs and 2 outputs
// output1 = count
// output2 = avg(in1)*1000
// [count] = [count] + 1
// [avg_value] = ( [avg_value]*[count] + [value_to_be_averaged] ) / ( [count] + 1 )
result1 = output_value + 1;
// It should be
// result2 = ( (operand_2_value*output_value) + operand_1_value ) / (output_value + 1);
// but we'd like 3 decimal places
result2 = ( (operand_2_value*output_value) + operand_1_value*MULTIPLY_FACTOR ) / (output_value + 1);
break;}
case OPCODE_VAR:{
OFL_LOG_DBG(LOG_MODULE, "Executing OPCODE_VAR");
// var( [count] , [value_to_be_varianced] , [avg_value] , [var_value]) = (IO1 , IN1 , IO2, IO3) has 4 inputs and 3 outputs
// output1 = count
// output2 = avg(in1)*1000
// output3 = var(in1)
// [count] = [count] + 1
// [avg_value] = ( [avg_value]*[count] + [value_to_be_averaged] ) / ( [count] + 1 )
// [var_value] = ( [var_value]*[count] + ([value_to_be_varianced] - [avg_value])*([value_to_be_varianced] - [NEW_avg_value]) ) / ( [count] + 1 )
/*
When [count]=0, [var_value] would be set to [value_to_be_varianced]^2 because the HW would calculate
[avg_value] in parallel
with [var_value], so [avg_value] used to compute [var_value] would be still 0!
Thus, when the first sample is added, [var_value] must be 0!
*/
result1 = output_value + 1;
// It should be
// result2 = ( (operand_2_value*output_value) + operand_1_value ) / (output_value + 1);
// but we'd like 3 decimal places
result2 = ( (operand_2_value*output_value) + operand_1_value*MULTIPLY_FACTOR ) / (output_value + 1);
if (output_value==0)
result3 = 0;
else {
// As result2 is avg_value*1000, operand_2 and result2 need to be divided by 1000
result3 = (( (operand_3_value*output_value) + (operand_1_value-operand_2_value/MULTIPLY_FACTOR)*(operand_1_value-result2/MULTIPLY_FACTOR) ) / (output_value + 1));
}
break;}
case OPCODE_POLY_SUM:{
OFL_LOG_DBG(LOG_MODULE, "Executing OPCODE_POLY_SUM");
// polysum( [count] , [value_to_be_varianced] , [avg_value] , [var_value]) = (OUT1 , IN1 , IN2, IN3, COEFF1, COEFF2, COEFF3, COEFF4) has 8 inputs and 1 output
//output = operand_1_value*coeff_1 + operand_2_value*coeff_2 + operand_3_value*coeff_3 + operand_4_value*coeff_4;
result1 = 0;
if (coeff_1<0)
result1 -= operand_1_value*abs(coeff_1);
else
result1 += operand_1_value*coeff_1;
if (coeff_2<0)
result1 -= operand_2_value*abs(coeff_2);
else
result1 += operand_2_value*coeff_2;
if (coeff_3<0)
result1 -= operand_3_value*abs(coeff_3);
else
result1 += operand_3_value*coeff_3;
if (coeff_4<0)
result1 -= operand_4_value*abs(coeff_4);
else
result1 += operand_4_value*coeff_4;
break;
}
default:{
OFL_LOG_DBG(LOG_MODULE, "SET DATA VAR action has invalid opcode (%u).", act->opcode );
return;}
}
// Write results to the corresponding output(s)
switch(act->opcode){
case OPCODE_SUM:
case OPCODE_SUB:
case OPCODE_MUL:
case OPCODE_DIV:
case OPCODE_POLY_SUM:{
//result1 is written in output
switch((act->operand_types>>7)&1){
case OPERAND_TYPE_FLOW_DATA_VAR:{
struct ofl_exp_set_flow_data_variable p = (struct ofl_exp_set_flow_data_variable)
{.table_id = pkt->table_id,
.flow_data_variable_id = act->output,
.key_len = key_len,
.value = result1,
.mask = 0xFFFFFFFF,
.key = {}};
memcpy(p.key, key, key_len);
state_table_set_flow_data_variable(table,&p);
break;}
case OPERAND_TYPE_GLOBAL_DATA_VAR:{
table->global_data_var[act->output] = result1;
OFL_LOG_DBG(LOG_MODULE, "Global data variable %d updated to %"PRIu32,act->output,table->global_data_var[act->output]);
break;}
}
break;
}
case OPCODE_AVG:{
//result1 is written in output
switch((act->operand_types>>7)&1){
case OPERAND_TYPE_FLOW_DATA_VAR:{
struct ofl_exp_set_flow_data_variable p = (struct ofl_exp_set_flow_data_variable)
{.table_id = pkt->table_id,
.flow_data_variable_id = act->output,
.key_len = key_len,
.value = result1,
.mask = 0xFFFFFFFF,
.key = {}};
memcpy(p.key, key, key_len);
state_table_set_flow_data_variable(table,&p);
break;}
case OPERAND_TYPE_GLOBAL_DATA_VAR:{
table->global_data_var[act->output] = result1;
OFL_LOG_DBG(LOG_MODULE, "Global data variable %d updated to %"PRIu32,act->output,table->global_data_var[act->output]);
break;}
}
//result2 is written in operand_2
switch((act->operand_types>>12)&3){
case OPERAND_TYPE_FLOW_DATA_VAR:{
struct ofl_exp_set_flow_data_variable p = (struct ofl_exp_set_flow_data_variable)
{.table_id = pkt->table_id,
.flow_data_variable_id = act->operand_2,
.key_len = key_len,
.value = result2,
.mask = 0xFFFFFFFF,
.key = {}};
memcpy(p.key, key, key_len);
state_table_set_flow_data_variable(table,&p);
break;}
case OPERAND_TYPE_GLOBAL_DATA_VAR:{
table->global_data_var[act->operand_2] = result2;
OFL_LOG_DBG(LOG_MODULE, "Global data variable %d updated to %"PRIu32,act->operand_2,table->global_data_var[act->operand_2]);
break;}
}
break;
}
case OPCODE_VAR:{
//result1 is written in output
switch((act->operand_types>>7)&1){
case OPERAND_TYPE_FLOW_DATA_VAR:{
struct ofl_exp_set_flow_data_variable p = (struct ofl_exp_set_flow_data_variable)
{.table_id = pkt->table_id,
.flow_data_variable_id = act->output,
.key_len = key_len,
.value = result1,
.mask = 0xFFFFFFFF,
.key = {}};
memcpy(p.key, key, key_len);
state_table_set_flow_data_variable(table,&p);
break;}
case OPERAND_TYPE_GLOBAL_DATA_VAR:{
table->global_data_var[act->output] = result1;
OFL_LOG_DBG(LOG_MODULE, "Global data variable %d updated to %"PRIu32,act->output,table->global_data_var[act->output]);
break;}
}
//result2 is written in operand_2
switch((act->operand_types>>12)&3){
case OPERAND_TYPE_FLOW_DATA_VAR:{
struct ofl_exp_set_flow_data_variable p = (struct ofl_exp_set_flow_data_variable)
{.table_id = pkt->table_id,
.flow_data_variable_id = act->operand_2,
.key_len = key_len,
.value = result2,
.mask = 0xFFFFFFFF,
.key = {}};
memcpy(p.key, key, key_len);
state_table_set_flow_data_variable(table,&p);
break;}
case OPERAND_TYPE_GLOBAL_DATA_VAR:{
table->global_data_var[act->operand_2] = result2;
OFL_LOG_DBG(LOG_MODULE, "Global data variable %d updated to %"PRIu32,act->operand_2,table->global_data_var[act->operand_2]);
break;}
}
//result3 is written in operand_3
switch((act->operand_types>>10)&3){
case OPERAND_TYPE_FLOW_DATA_VAR:{
struct ofl_exp_set_flow_data_variable p = (struct ofl_exp_set_flow_data_variable)
{.table_id = pkt->table_id,
.flow_data_variable_id = act->operand_3,
.key_len = key_len,
.value = result3,
.mask = 0xFFFFFFFF,
.key = {}};
memcpy(p.key, key, key_len);
state_table_set_flow_data_variable(table,&p);
break;}
case OPERAND_TYPE_GLOBAL_DATA_VAR:{
table->global_data_var[act->operand_3] = result3;
OFL_LOG_DBG(LOG_MODULE, "Global data variable %d updated to %"PRIu32,act->operand_3,table->global_data_var[act->operand_3]);
break;}
}
break;
}
}
}
The write-context-to-field action writes the flow state, a flow data variable or a global data
variable into a packet header field.
struct ofl_action_set_field * state_table_write_context_to_field(struct state_table *table, struct
ofl_exp_action_write_context_to_field *act, struct packet *pkt) {
struct state_entry *state_entry;
struct ofl_action_set_field *set_field_act;
uint32_t src_value = 0;
switch (act->src_type){
case SOURCE_TYPE_FLOW_DATA_VAR:
state_entry = state_table_lookup(table, pkt);
if(state_entry!=NULL){
src_value = state_entry->flow_data_var[act->src_id];
} else {
OFL_LOG_WARN(LOG_MODULE, "ERROR WRITE CONTEXT TO FIELD at stage %u: flow_data_var cannot be found", pkt->table_id);
return NULL;
}
break;
case SOURCE_TYPE_GLOBAL_DATA_VAR:
src_value = table->global_data_var[act->src_id];
break;
case SOURCE_TYPE_STATE:
state_entry = state_table_lookup(table, pkt);
if(state_entry!=NULL){
src_value = (uint32_t) state_entry->state;
} else {
OFL_LOG_WARN(LOG_MODULE, "ERROR WRITE CONTEXT TO FIELD at stage %u: state cannot be found", pkt->table_id);
return NULL;
}
break;
}
// build a dummy ofl_action_set_field to re-use code from standard OpenFlow set-field action
set_field_act = (struct ofl_action_set_field *)malloc(sizeof(struct ofl_action_set_field));
set_field_act->field = (struct ofl_match_tlv*) malloc(sizeof(struct ofl_match_tlv));
set_field_act->field->header = act->dst_field;
set_field_act->field->value = malloc(OXM_LENGTH(set_field_act->field->header));
//memcpy size is min_size(src_value and dst_field)
if (OXM_LENGTH(set_field_act->field->header)>sizeof(src_value)) {
memcpy(set_field_act->field->value , &src_value, sizeof(src_value));
} else {
memcpy(set_field_act->field->value , &src_value, OXM_LENGTH(set_field_act->field->header));
}
return set_field_act;
}
1.2 Extended BEBA basic controller prototype
As for the BEBA controller, we selected the OpenFlow controller Ryu [2] and further extended
our modified version of Ryu to add the support for the new required messages.
An overview of the Ryu OpenFlow controller can be found in section 1.3.1 of D2.2.
To support advanced BEBA, the two modules introduced for basic BEBA
(ryu.ofproto.beba_v1_0 and ryu.ofproto.beba_v1_0_parser) have been extended with all the
methods needed to deal with the new functionalities.
As with basic BEBA, we chose an approach based on the OpenFlow Experimenter to guarantee a
minimal departure from the original Ryu code and to ease maintenance when the main code base
is updated.
1.2.1 Implementation of the BEBA extensions
In this section, the controller side of the XFSM instantiation is described with a simple
example: we want to limit the maximum amount of bytes sent from a given MAC address.
This application requires keeping a state for each source MAC address, thus lookup-scope and
update-scope are set to OXM_OF_ETH_SRC. The code for enabling statefulness and configuring
the two scopes is exactly the same as the one presented in section 1.3.3 of D2.2.
""" Set table 0 as stateful """
req = bebaparser.OFPExpMsgConfigureStatefulTable(
datapath=datapath,
table_id=0,
stateful=1)
datapath.send_msg(req)
""" Set lookup extractor = {eth_src} """
req = bebaparser.OFPExpMsgKeyExtract(datapath=datapath,
command=bebaproto.OFPSC_EXP_SET_L_EXTRACTOR,
fields=[ofproto.OXM_OF_ETH_SRC],
table_id=0)
datapath.send_msg(req)
""" Set update extractor = {eth_src} """
req = bebaparser.OFPExpMsgKeyExtract(datapath=datapath,
command=bebaproto.OFPSC_EXP_SET_U_EXTRACTOR,
fields=[ofproto.OXM_OF_ETH_SRC],
table_id=0)
datapath.send_msg(req)
The object returned by each of the three function calls, req, is a Ryu API object representing
a standard OpenFlow OFPExperimenter message, not a custom BEBA object, and thus it can be
sent to the switch using the method datapath.send_msg(req).
Suppose now we want to configure the following condition: TOT_SENT_BYTES≥1000. The
condition is made of a flow data variable, a constant and a “≥” condition.
Conditions are configured through the OFPExpMsgSetCondition() function
req = bebaparser.OFPExpMsgSetCondition(
datapath=datapath,
table_id=0,
condition_id=0,
condition=bebaproto.CONDITION_GTE,
operand_1_fd_id=0,
operand_2_gd_id=1
)
datapath.send_msg(req)
which produces a State Modification message with command OFPSC_EXP_SET_CONDITION.
The function parameters are the ID of the stage, the ID of the condition, the condition operator
and the two operands. In this case we are setting a “≥” condition with ID=0 between the first
flow data variable and the second global data variable.
def OFPExpMsgSetCondition(datapath, table_id, condition_id, condition, operand_1_fd_id=None, operand_1_gd_id=None,
        operand_1_hf_id=None, operand_2_fd_id=None, operand_2_gd_id=None, operand_2_hf_id=None):
    command=bebaproto.OFPSC_EXP_SET_CONDITION
    if condition_id<0 or condition_id>bebaproto.MAX_CONDITIONS_NUM-1:
        LOG.debug("OFPExpMsgSetCondition: invalid condition_id")
    if condition<0 or condition>5:
        LOG.debug("OFPExpMsgSetCondition: invalid condition")
    if sum(1 for i in [operand_1_fd_id,operand_1_gd_id,operand_1_hf_id] if i != None)!=1:
        LOG.debug("OFPExpMsgSetCondition: you need to choose exactly one type of operand_1")
    if sum(1 for i in [operand_2_fd_id,operand_2_gd_id,operand_2_hf_id] if i != None)!=1:
        LOG.debug("OFPExpMsgSetCondition: you need to choose exactly one type of operand_2")
    # operand_types=xxyy0000 where xx=operand_1_type and yy=operand_2_type
    if operand_1_fd_id!=None:
        if operand_1_fd_id<0 or operand_1_fd_id>bebaproto.MAX_FLOW_DATA_VAR_NUM-1:
            LOG.debug("OFPExpMsgSetCondition: invalid flow data variable ID")
        operand_types=bebaproto.OPERAND_TYPE_FLOW_DATA_VAR<<6
        operand_1=operand_1_fd_id
    elif operand_1_gd_id!=None:
        if operand_1_gd_id<0 or operand_1_gd_id>bebaproto.MAX_GLOBAL_DATA_VAR_NUM-1:
            LOG.debug("OFPExpMsgSetCondition: invalid global data variable ID")
        operand_types=bebaproto.OPERAND_TYPE_GLOBAL_DATA_VAR<<6
        operand_1=operand_1_gd_id
    elif operand_1_hf_id!=None:
        if operand_1_hf_id<0 or operand_1_hf_id>bebaproto.MAX_HEADER_FIELDS-1:
            LOG.debug("OFPExpMsgSetCondition: invalid header field ID")
        operand_types=bebaproto.OPERAND_TYPE_HEADER_FIELD<<6
        operand_1=operand_1_hf_id
    if operand_2_fd_id!=None:
        if operand_2_fd_id<0 or operand_2_fd_id>bebaproto.MAX_FLOW_DATA_VAR_NUM-1:
            LOG.debug("OFPExpMsgSetCondition: invalid flow data variable ID")
        operand_types=operand_types | bebaproto.OPERAND_TYPE_FLOW_DATA_VAR<<4
        operand_2=operand_2_fd_id
    elif operand_2_gd_id!=None:
        if operand_2_gd_id<0 or operand_2_gd_id>bebaproto.MAX_GLOBAL_DATA_VAR_NUM-1:
            LOG.debug("OFPExpMsgSetCondition: invalid global data variable ID")
        operand_types=operand_types | bebaproto.OPERAND_TYPE_GLOBAL_DATA_VAR<<4
        operand_2=operand_2_gd_id
    elif operand_2_hf_id!=None:
        if operand_2_hf_id<0 or operand_2_hf_id>bebaproto.MAX_HEADER_FIELDS-1:
            LOG.debug("OFPExpMsgSetCondition: invalid header field ID")
        operand_types=operand_types | bebaproto.OPERAND_TYPE_HEADER_FIELD<<4
        operand_2=operand_2_hf_id
    data=struct.pack(bebaproto.OFP_EXP_STATE_MOD_PACK_STR, command)
    data+=struct.pack(bebaproto.OFP_EXP_STATE_MOD_SET_CONDITION_PACK_STR,table_id,condition_id,condition,operand_types,
        operand_1,operand_2)
    exp_type=bebaproto.OFPT_EXP_STATE_MOD
    return ofproto_parser.OFPExperimenter(datapath=datapath, experimenter=0xBEBABEBA, exp_type=exp_type,
        data=data)
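The xxyy0000 layout of the operand_types byte used for conditions can be checked in isolation with the sketch below. The helper names are hypothetical, and the numeric values given to the operand type constants are placeholders: the real values are defined in bebaproto and may differ.

```python
# Placeholder values for illustration only; the real ones live in bebaproto.
OPERAND_TYPE_FLOW_DATA_VAR = 0
OPERAND_TYPE_GLOBAL_DATA_VAR = 1
OPERAND_TYPE_HEADER_FIELD = 2


def pack_condition_operand_types(operand_1_type, operand_2_type):
    """Build the byte xxyy0000: xx=operand_1_type, yy=operand_2_type."""
    return ((operand_1_type & 0x3) << 6) | ((operand_2_type & 0x3) << 4)


def unpack_condition_operand_types(operand_types):
    """Recover (operand_1_type, operand_2_type) from the packed byte."""
    return (operand_types >> 6) & 0x3, (operand_types >> 4) & 0x3
```

With these placeholder values, the condition of the example (a flow data variable compared with a global data variable) is encoded with the two type fields 00 and 01 in the upper four bits, while the lower four bits stay at zero.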
To be able to correctly compute such condition, we need to store the constant “1000” in a
global data variable (for example the second one). This can be accomplished with the
OFPExpMsgsSetGlobalDataVariable() function.
req = bebaparser.OFPExpMsgsSetGlobalDataVariable(
datapath=datapath,
table_id=0,
global_data_variable_id=1,
value=1000
)
datapath.send_msg(req)
This function will create a State Modification message with command
OFPSC_EXP_SET_GLOBAL_DATA_VAR.
def OFPExpMsgsSetGlobalDataVariable(datapath, table_id, global_data_variable_id, value, mask=0xffffffff):
    command=bebaproto.OFPSC_EXP_SET_GLOBAL_DATA_VAR
    if global_data_variable_id<0 or global_data_variable_id>bebaproto.MAX_GLOBAL_DATA_VAR_NUM-1:
        LOG.debug("OFPExpMsgsSetGlobalDataVariable: invalid global_data_variable_id")
    data=struct.pack(bebaproto.OFP_EXP_STATE_MOD_PACK_STR, command)
    data+=struct.pack(bebaproto.OFP_EXP_STATE_MOD_SET_GLOBAL_DATA_VAR_PACK_STR,table_id,global_data_variable_id,value,
        mask)
    exp_type=bebaproto.OFPT_EXP_STATE_MOD
    return ofproto_parser.OFPExperimenter(datapath=datapath, experimenter=0xBEBABEBA, exp_type=exp_type,
        data=data)
Finally, we need to install the flow table entries that match on the condition and update the
flow data variable over which the condition is evaluated. We can use the standard
OFPMatch() and OFPFlowMod() Ryu APIs.
match = ofparser.OFPMatch(condition0=0)
actions = [bebaparser.OFPExpActionSetDataVariable(table_id=0, opcode=bebaproto.OPCODE_SUM, output_fd_id=0,
operand_1_fd_id=0, operand_2_hf_id=0),
ofparser.OFPActionOutput(ofproto.OFPP_FLOOD)]
inst = [ofparser.OFPInstructionActions(ofproto.OFPIT_APPLY_ACTIONS, actions)]
mod = ofparser.OFPFlowMod(datapath=datapath, table_id=0,priority=0, match=match, instructions=inst)
datapath.send_msg(mod)
match = ofparser.OFPMatch(condition0=1)
actions = [bebaparser.OFPExpActionSetDataVariable(table_id=0, opcode=bebaproto.OPCODE_SUM, output_fd_id=0,
operand_1_fd_id=0, operand_2_hf_id=0)]
inst = [ofparser.OFPInstructionActions(ofproto.OFPIT_APPLY_ACTIONS, actions)]
mod = ofparser.OFPFlowMod(datapath=datapath, table_id=0,priority=0, match=match, instructions=inst)
datapath.send_msg(mod)
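The combined effect of the condition and of the two flow entries can be simulated with a few lines of Python. This is a behavioural sketch under stated assumptions, not controller or switch code: the function name process_packet and the dictionary of per-source counters are hypothetical, an unknown source is assumed to start from a counter of 0, and the second entry is assumed to drop the packet because it carries no output action.

```python
THRESHOLD = 1000  # constant of the example, stored in the second global data variable


def process_packet(flow_data, eth_src, pkt_len, threshold=THRESHOLD):
    """Return 'flood' or 'drop' and update the byte counter of the source MAC."""
    sent = flow_data.get(eth_src, 0)            # FDV[0] of the flow (assumed 0 if unknown)
    condition0 = 1 if sent >= threshold else 0  # TOT_SENT_BYTES >= 1000
    flow_data[eth_src] = sent + pkt_len         # both entries run FDV[0]=FDV[0]+HF[0]
    # condition0=0 entry: update and flood; condition0=1 entry: update only.
    return "drop" if condition0 else "flood"


# Usage: three 600-byte packets from the same source, then one from another source.
counters = {}
decisions = [process_packet(counters, "00:00:00:00:00:01", 600) for _ in range(3)]
decisions.append(process_packet(counters, "00:00:00:00:00:02", 600))
```

Since the condition is evaluated before the update, the packet that crosses the threshold is still forwarded and only the following ones are dropped.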
We modified our beba_v1_0.py library to define 11 new matching fields to be able to install
flow entries matching on the result of up to 8 conditions, on the packet time stamp, on a
random number and on the packet length (in bytes) directly through the standard OFPMatch()
function.
# Beba experimenter fields
oxm_types = [
oxm_fields.BebaExperimenter('global_state', 0, type_desc.Int4),
oxm_fields.BebaExperimenter('state', 1, type_desc.Int4),
oxm_fields.BebaExperimenter('condition0', 2, type_desc.Int1),
oxm_fields.BebaExperimenter('condition1', 3, type_desc.Int1),
oxm_fields.BebaExperimenter('condition2', 4, type_desc.Int1),
oxm_fields.BebaExperimenter('condition3', 5, type_desc.Int1),
oxm_fields.BebaExperimenter('condition4', 6, type_desc.Int1),
oxm_fields.BebaExperimenter('condition5', 7, type_desc.Int1),
oxm_fields.BebaExperimenter('condition6', 8, type_desc.Int1),
oxm_fields.BebaExperimenter('condition7', 9, type_desc.Int1),
oxm_fields.BebaExperimenter('timestamp', 10, type_desc.Int4),
oxm_fields.BebaExperimenter('random', 11, type_desc.Int2),
oxm_fields.BebaExperimenter('pkt_len', 12, type_desc.Int2)
]
Ryu controller will then call our OFPExpActionSetDataVariable() function when the Flow
Modification message has to be created.
def OFPExpActionSetDataVariable(table_id, opcode, output_gd_id=None, output_fd_id=None, operand_1_fd_id=None,
operand_1_gd_id=None, operand_1_hf_id=None, operand_2_fd_id=None, operand_2_gd_id=None, operand_2_hf_id=None,
operand_2_cost=None, operand_3_fd_id=None, operand_3_gd_id=None, operand_3_hf_id=None,operand_4_fd_id=None,
operand_4_gd_id=None, operand_4_hf_id=None, coeff_1=0, coeff_2=0, coeff_3=0, coeff_4=0, fields=[]):
"""
Returns a Set Data Variable experimenter action
This action updates the flow data/global data variable.
================ ======================================================
Attribute
Description
================ ======================================================
table_id
Stage ID
opcode
Operation code
output_XX_id
ID of destination global/data variable
operand_1_XX_id ID of first global/data variable/header field
operand_2_XX_id ID of second global/data variable/header field/constant
operand_3_XX_id ID of third global/data variable/header field
operand_4_XX_id ID of fourth global/data variable/header field
================ ======================================================
"""
field_count=len(fields)
if field_count > bebaproto.MAX_FIELD_COUNT:
field_count = 0
LOG.debug("OFPExpActionSetState: Number of fields given > MAX_FIELD_COUNT")
BEBA/WP2 – D2.4 Version: 1.0 Page 26 of 49 BEBA
Behavioural Based
Forwarding Grant Agreement: 644122
Beba
BEhavioural BAsed forwarding
act_type=bebaproto.OFPAT_EXP_SET_DATA_VAR
if opcode>bebaproto.OPCODE_POLY_SUM:
LOG.debug("OFPExpActionSetDataVariable: invalid opcode")
if sum(1 for i in [output_gd_id,output_fd_id] if i != None)!=1:
LOG.debug("OFPExpActionSetDataVariable: you need to choose exactly one type of output operand")
if sum(1 for i in [operand_1_fd_id,operand_1_gd_id,operand_1_hf_id] if i != None)!=1:
LOG.debug("OFPExpActionSetDataVariable: you need to choose exactly one type of first operand")
if sum(1 for i in [operand_2_fd_id,operand_2_gd_id,operand_2_hf_id,operand_2_cost] if i != None)!=1:
LOG.debug("OFPExpActionSetDataVariable: you need to choose exactly one type of second output operand")
# operand_types=aabbccdde0000000 where aa=operand_1_type, bb=operand_2_type, cc=operand_3_type,
dd=operand_4_type and e=output_type
if operand_1_fd_id!=None:
if operand_1_fd_id<0 or operand_1_fd_id>bebaproto.MAX_FLOW_DATA_VAR_NUM-1:
LOG.debug("OFPExpActionSetDataVariable: invalid flow data variable ID")
operand_types=bebaproto.OPERAND_TYPE_FLOW_DATA_VAR<<14
operand_1=operand_1_fd_id
elif operand_1_gd_id!=None:
if operand_1_gd_id<0 or operand_1_gd_id>bebaproto.MAX_GLOBAL_DATA_VAR_NUM-1:
LOG.debug("OFPExpActionSetDataVariable: invalid global data variable ID")
operand_types=bebaproto.OPERAND_TYPE_GLOBAL_DATA_VAR<<14
operand_1=operand_1_gd_id
elif operand_1_hf_id!=None:
if operand_1_hf_id<0 or operand_1_hf_id>bebaproto.MAX_HEADER_FIELDS-1:
LOG.debug("OFPExpActionSetDataVariable: invalid header field ID")
operand_types=bebaproto.OPERAND_TYPE_HEADER_FIELD<<14
operand_1=operand_1_hf_id
[…]
if coeff_1<-128 or coeff_1>127:
LOG.debug("OFPExpActionSetDataVariable:
if coeff_2<-128 or coeff_2>127:
LOG.debug("OFPExpActionSetDataVariable:
if coeff_3<-128 or coeff_3>127:
LOG.debug("OFPExpActionSetDataVariable:
if coeff_4<-128 or coeff_4>127:
LOG.debug("OFPExpActionSetDataVariable:
requires -128 <= coeff_1 <= 127")
requires -128 <= coeff_2 <= 127")
requires -128 <= coeff_3 <= 127")
requires -128 <= coeff_4 <= 127")
    if output_fd_id!=None:
        if output_fd_id<0 or output_fd_id>bebaproto.MAX_FLOW_DATA_VAR_NUM-1:
            LOG.debug("OFPExpActionSetDataVariable: invalid flow data variable ID")
        operand_types=operand_types | bebaproto.OPERAND_TYPE_FLOW_DATA_VAR<<7
        output=output_fd_id
    elif output_gd_id!=None:
        if output_gd_id<0 or output_gd_id>bebaproto.MAX_GLOBAL_DATA_VAR_NUM-1:
            LOG.debug("OFPExpActionSetDataVariable: invalid global data variable ID")
        operand_types=operand_types | bebaproto.OPERAND_TYPE_GLOBAL_DATA_VAR<<7
        output=output_gd_id
    data=struct.pack(bebaproto.OFP_EXP_ACTION_SET_DATA_VARIABLE_PACK_STR, act_type, operand_types, table_id,
                     opcode, output, operand_1, operand_2, operand_3, operand_4, coeff_1, coeff_2, coeff_3, coeff_4, field_count)
    field_extract_format='!I'
    if field_count <= bebaproto.MAX_FIELD_COUNT:
        for f in range(field_count):
            data+=struct.pack(field_extract_format,fields[f])
    # Actions must be 64 bit aligned! Fields are 32bits, so we need padding only if field_count is odd
    if field_count>0 and field_count%2!=0:
        data+=struct.pack(field_extract_format,0)
    return ofproto_parser.OFPActionExperimenterUnknown(experimenter=0xBEBABEBA, data=data)
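The bit layout described in the comment of the listing above, and the 64 bit alignment rule for the trailing fields, can be sketched in isolation as follows. This is only an illustration: the numeric type codes and the helper names `pack_operand_types` and `pad_fields` are assumptions (the real constants are defined in bebaproto), and the shifts for operands 2 to 4 are inferred from the `aabbccdde0000000` comment, while the shifts 14 and 7 are the ones used in the listing.

```python
import struct

# Hypothetical type codes; the real constants are defined in bebaproto.
OPERAND_TYPE_FLOW_DATA_VAR = 0
OPERAND_TYPE_GLOBAL_DATA_VAR = 1
OPERAND_TYPE_HEADER_FIELD = 2

# Bit offsets implied by the layout aabbccdde0000000 (16 bits, MSB first).
OPERAND_SHIFTS = (14, 12, 10, 8)
OUTPUT_SHIFT = 7


def pack_operand_types(operand_kinds, output_kind):
    """Build the 16-bit operand_types word from up to four operand type codes."""
    word = 0
    for shift, kind in zip(OPERAND_SHIFTS, operand_kinds):
        if not 0 <= kind <= 3:
            raise ValueError("operand type must fit in 2 bits")
        word |= kind << shift
    if output_kind not in (0, 1):
        raise ValueError("output type must fit in 1 bit")
    return word | (output_kind << OUTPUT_SHIFT)


def pad_fields(fields):
    """Serialize 32-bit fields, padding to a 64-bit boundary when the count is odd."""
    data = b"".join(struct.pack("!I", f) for f in fields)
    if len(fields) % 2 != 0:
        data += struct.pack("!I", 0)
    return data
```

With these assumed codes, FDV[0]=FDV[0]+HF[0] would use a flow data variable as first operand, a header field as second operand and a flow data variable as output.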
In our example application we have defined the update function
TOT_SENT_BYTES+=PKT_LEN, which maps onto FDV[0]=FDV[0]+HF[0]. We therefore
need to configure a header field extractor (in this case the first one) to extract the packet size
(in bytes). We can do this with the OFPExpMsgHeaderFieldExtract() function.
req = bebaparser.OFPExpMsgHeaderFieldExtract(
    datapath=datapath,
    table_id=0,
    extractor_id=0,
    field=bebaproto.OXM_EXP_PKT_LEN
)
datapath.send_msg(req)
This function creates a State Modification message with command
OFPSC_EXP_SET_HEADER_FIELD_EXTRACTOR.
def OFPExpMsgHeaderFieldExtract(datapath, table_id, extractor_id, field):
    command=bebaproto.OFPSC_EXP_SET_HEADER_FIELD_EXTRACTOR
    if extractor_id<0 or extractor_id>bebaproto.MAX_HEADER_FIELDS-1:
        LOG.debug("OFPExpMsgHeaderFieldExtract: invalid extractor_id")
    data=struct.pack(bebaproto.OFP_EXP_STATE_MOD_PACK_STR, command)
    data+=struct.pack(bebaproto.OFP_EXP_STATE_MOD_SET_HEADER_EXTRACTOR_PACK_STR,table_id,extractor_id,field)
    exp_type=bebaproto.OFPT_EXP_STATE_MOD
    return ofproto_parser.OFPExperimenter(datapath=datapath, experimenter=0xBEBABEBA, exp_type=exp_type,
                                          data=data)
Another message (not needed by the example application) is the State Modification
message with command OFPSC_EXP_SET_FLOW_DATA_VAR, which sets the value of a
specific flow data variable given the new value (optionally masked) and a key to access
the flow context in the state table.
def OFPExpMsgSetFlowDataVariable(datapath, table_id, flow_data_variable_id, keys, value, mask=0xffffffff):
    command=bebaproto.OFPSC_EXP_SET_FLOW_DATA_VAR
    key_count=len(keys)
    if flow_data_variable_id<0 or flow_data_variable_id>bebaproto.MAX_FLOW_DATA_VAR_NUM-1:
        LOG.debug("OFPExpMsgSetFlowDataVariable: invalid flow_data_variable_id")
    if key_count > bebaproto.MAX_KEY_LEN:
        key_count = 0
        LOG.debug("OFPExpMsgSetFlowDataVariable: Number of keys given > MAX_KEY_LEN")
    data=struct.pack(bebaproto.OFP_EXP_STATE_MOD_PACK_STR, command)
    data+=struct.pack(bebaproto.OFP_EXP_STATE_MOD_SET_FLOW_DATA_VAR_PACK_STR, table_id, flow_data_variable_id,
                      key_count, value, mask)
    field_extract_format='!B'
    if key_count <= bebaproto.MAX_KEY_LEN:
        for f in range(key_count):
            data+=struct.pack(field_extract_format,keys[f])
    exp_type=bebaproto.OFPT_EXP_STATE_MOD
    return ofproto_parser.OFPExperimenter(datapath=datapath, experimenter=0xBEBABEBA, exp_type=exp_type,
                                          data=data)
All these functions can be accessed from a Ryu application by importing the two BEBA modules:
import ryu.ofproto.beba_v1_0 as bebaproto
import ryu.ofproto.beba_v1_0_parser as bebaparser
The second module implements all the functions presented so far, while the first one defines all
the constants and the wire format of each message and data structure exchanged between the
controller and the switch.
1.3 State synchronization mechanism
To allow the synchronization of the states between the switches and the controller, we
presented the state synchronization API as part of the basic BEBA API.
This mechanism has been extended to report to the controller not only the list of states but
also the list of flow data variables.
The data structures of both the switch and the controller for request and reply messages have
been left unchanged, but the message format of the single state entry has been modified.
struct ofp_exp_state_entry{
    uint32_t key_len;
    uint8_t key[OFPSC_MAX_KEY_LEN];
    uint32_t state;
    uint32_t flow_data_var[OFPSC_MAX_FLOW_DATA_VAR_NUM];
};
OFP_ASSERT(sizeof(struct ofp_exp_state_entry) == 80);
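On the controller side, a state entry with this layout could be decoded roughly as sketched below. The values of `OFPSC_MAX_KEY_LEN` and `OFPSC_MAX_FLOW_DATA_VAR_NUM` are not given in this section: the values 48 and 6 used here are assumptions, chosen only because they are one combination that adds up to the asserted 80 bytes. Network byte order and the helper name `parse_state_entry` are assumptions as well.

```python
import struct

# Assumed values: the text only states that the entry is 80 bytes long.
OFPSC_MAX_KEY_LEN = 48            # assumption
OFPSC_MAX_FLOW_DATA_VAR_NUM = 6   # assumption

# key_len (uint32), key bytes, state (uint32), flow data variables (uint32 each)
STATE_ENTRY_FMT = "!I%dsI%dI" % (OFPSC_MAX_KEY_LEN, OFPSC_MAX_FLOW_DATA_VAR_NUM)


def parse_state_entry(buf):
    """Decode one state entry into (key, state, flow_data_vars)."""
    fields = struct.unpack(STATE_ENTRY_FMT, buf)
    key_len, raw_key, state = fields[0], fields[1], fields[2]
    # Only the first key_len bytes of the fixed-size key array are meaningful.
    return raw_key[:key_len], state, list(fields[3:])
```

The fixed-size key array means every entry has the same length on the wire, so a reply carrying several entries can be split at 80-byte boundaries under these assumptions.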
2 FPGA-based hardware proof of concept prototype
2.1 OPP prototype architecture design
The OPP architecture described in deliverable D2.3 has been implemented as a proof-of-concept
prototype using the NetFPGA SUME, a PCIe board equipped with a latest-generation
Xilinx FPGA and four 10 GbE ports. The aim of the prototype is to prove the hardware feasibility
of the proposed architecture and to gain insight into the possible issues and shortcomings (in
terms of latency, scalability, etc.) of a possible ASIC implementation of the proposed
OPP architecture.
Figure 2 Scheme of an OPP stage
Figure 2 provides a block-level overview of the hardware implementation of a single
OPP stage. Pipelining an OPP stage with other OPP stages or ordinary match/action tables
does not affect the single-stage design. Figure 2 also illustrates the auxiliary blocks
needed to handle packet input capture and output delivery, assuming a 4 × 4 port
switch. This section describes the main blocks composing the OPP stage prototype.
Figure 3 Condition logic block array element
2.1.1 Packet reception and header field extraction
Packets received on the input queues are collected and serialized by a mixer block, so that the
OPP block receives one packet per clock cycle. Each packet is then processed by a Packet
Fields Extractor, configured to provide, together with the header fields (8 in our prototype),
the blocks required in the next processing stages, specifically:
i) the Flow Key used to query the Flow Context Table,
ii) the header fields used by the Condition block,
iii) the header fields used by the Update Logic Block, and
iv) the (possibly different) Flow Key used for updating the Flow Context.
The Packet Fields Extractor is easily implemented in HW as a parallel array of elementary Shift
and Mask (SaM) blocks, where each SaM block selects the beginning of the targeted header
field (the shift function) and performs a bit-wise mask operation.
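A software analogue of a single SaM block is sketched below, assuming a field is described by a bit offset and a bit width (both parameter names and the function name `shift_and_mask` are ours). The hardware evaluates an array of such blocks in parallel; the sketch extracts one field at a time.

```python
def shift_and_mask(packet: bytes, bit_offset: int, bit_width: int) -> int:
    """Return the bit_width-bit field that starts bit_offset bits into the packet."""
    total_bits = len(packet) * 8
    if bit_offset < 0 or bit_width <= 0 or bit_offset + bit_width > total_bits:
        raise ValueError("field lies outside the packet")
    as_int = int.from_bytes(packet, "big")
    # Shift: bring the last bit of the field down to bit 0.
    shifted = as_int >> (total_bits - bit_offset - bit_width)
    # Mask: keep only the bit_width least significant bits.
    return shifted & ((1 << bit_width) - 1)
```

For an untagged Ethernet frame, for instance, the destination address is the field at offset 0 with width 48, and the EtherType is the field at offset 96 with width 16.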
2.1.2 Flow Context Table
This data structure stores both the state and the registries associated with Flow
Keys. It consists of a hash table (we implemented a d-left hash table with d = 4) to handle
exact matches, plus a TCAM to handle wildcard matches. Unlike the hash table, which must be
large enough to store per-flow states, a very small TCAM suffices, as it only has to
handle the few special cases where wildcard matches are needed (mainly default states,
where the TCAM priority makes it possible to differentiate default states for different protocols or packet
formats). Our implementation uses 128 bit Flow Keys and returns a 146 bit value, which is
sufficient to support a 16 bit state label, four 32 bit per-flow registries, and two auxiliary bits
per entry used by the microcontroller for housekeeping (see below).
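The exact-match part of the table can be illustrated with a minimal software d-left hash table. This is a sketch under our own assumptions, not the prototype's RTL: the bucket count, bucket depth and the use of salted BLAKE2b as the d hash functions are illustrative choices (4 sub-tables × 256 buckets × 4 cells happens to give 4096 entries).

```python
import hashlib


class DLeftHashTable:
    """d sub-tables; a key goes to the least loaded candidate bucket, leftmost on ties."""

    def __init__(self, d=4, buckets_per_table=256, bucket_size=4):
        self.d = d
        self.buckets_per_table = buckets_per_table
        self.bucket_size = bucket_size
        self.tables = [[[] for _ in range(buckets_per_table)] for _ in range(d)]

    def _index(self, table_id, key):
        # One independent hash per sub-table, obtained by salting with the table id.
        digest = hashlib.blake2b(key, digest_size=8, salt=bytes([table_id])).digest()
        return int.from_bytes(digest, "big") % self.buckets_per_table

    def lookup(self, key):
        # Hardware probes the d candidate buckets in parallel; here we loop.
        for t in range(self.d):
            for k, v in self.tables[t][self._index(t, key)]:
                if k == key:
                    return v
        return None

    def insert(self, key, value):
        candidates = [self.tables[t][self._index(t, key)] for t in range(self.d)]
        for bucket in candidates:
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)   # update an existing flow context
                    return True
        # min() returns the first minimum, so ties go to the leftmost sub-table.
        target = min(candidates, key=len)
        if len(target) >= self.bucket_size:
            return False                       # all candidate buckets are full
        target.append((key, value))
        return True
```

A flow context here would be the (state, registries, flags) value; a failed insert corresponds to the overflow case that a real design has to handle separately.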
2.1.3 Condition Logic Block
This block makes it possible to configure conditions on pairs of inputs (per-flow registries, global registries,
header fields) and evaluates them to return a boolean 0/1 vector as output. This block, shown
in Figure 3, comprises multiple (8 in our implementation) parallel configurable comparators,
each of which takes as input two operands selected among all the flow registries Ri, all the
global registries Gi and the header fields Hi coming from the packet field extractor. Two
multiplexers (one for each operand) provide the selection operation. Each comparator supports
five arithmetic comparison functions: >, ≥, =, ≤, <.
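The behaviour of the comparator array can be sketched as follows. The encoding of a condition as `(operator, (bank, index), (bank, index))` and the bank letters are our own illustrative choices, and the operator set includes `<` on the assumption that it is the fifth comparison function.

```python
import operator

# Comparison functions of one comparator ("<" is assumed to be the fifth one).
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "<=": operator.le,
    "<": operator.lt,
}


def evaluate_conditions(conditions, flow_regs, global_regs, header_fields):
    """Return the 0/1 vector produced by the configured comparators."""
    banks = {"R": flow_regs, "G": global_regs, "H": header_fields}
    vector = []
    for op, (bank_a, idx_a), (bank_b, idx_b) in conditions:
        # The two lookups play the role of the two operand multiplexers.
        a = banks[bank_a][idx_a]
        b = banks[bank_b][idx_b]
        vector.append(1 if COMPARATORS[op](a, b) else 0)
    return vector
```

Each element of the returned vector is one bit of the input later presented to the XFSM table.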
2.1.4 XFSM Table
While, conceptually, this is a key “computational” stage in our proposed architecture (it
performs a state transition step), in practice its implementation is straightforward: it just relies
on an ordinary TCAM. Although a posteriori it may seem obvious, such simple support for a
“full” XFSM transition step is enabled by the clear distinction between the configuration and
evaluation of conditions (by the Condition Logic Block) and their usage as boolean outcomes,
hence one bit per condition, which can be directly used as TCAM input along with the state
label and the usual packet header fields used in OpenFlow matches. As in standard
implementations, the TCAM provides as output the row associated with the matching rule with
the highest priority, and is followed by a companion RAM which stores the associated output. In our
specific case, this consists of
i) the next state label (16 bits) used to update the flow context table,
ii) the action to perform on the packet (16 bits) and the ALU instructions which
shall be applied to update registries (our prototype supports up to 5 instructions
of 32 bits each).
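The TCAM plus companion RAM pair can be modelled in software as an ordered list of ternary entries. The sketch below is illustrative only: the key encoding (state label in the upper bits, condition bits in the lower 8 bits) and the result strings are assumptions made for the example.

```python
def tcam_lookup(entries, key):
    """Return the result of the first (highest priority) matching ternary entry.

    entries is a list of (value, mask, result) in priority order; a mask bit
    set to 0 means "don't care" for that bit position.
    """
    for value, mask, result in entries:
        if (key & mask) == (value & mask):
            return result   # plays the role of the companion RAM row
    return None


def make_key(state, condition_bits):
    """Assumed key layout: 16-bit state label followed by 8 condition bits."""
    return (state << 8) | condition_bits
```

A transition such as "in state 0x0A, if condition C0 holds, stay in 0x0A" then becomes one entry whose mask covers the state bits and bit 0 of the condition vector only.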
2.1.5 Update Logic Block
This is the second “computational” stage in our architecture. This block deploys an array of
small ALUs (Arithmetic and Logic Units), 5 in our implementation, which execute in parallel
the instructions provided as output of the XFSM Table. The ALUs support a specific set of
elementary (micro)instructions that frequently occur in traffic control applications. The
updated registry values are then stored in the relevant memory locations
(flow registries and/or global registries). The instructions
specifically implemented in our prototype are listed in Tables I and II.
TABLE I: ALU basic instruction set
Instructions in Table I are those of a typical RISC architecture, while instructions in Table II
are specific to packet processing tasks. At each step, the computations that the
Update Logic Block must perform are provided by the output of the XFSM transition and are
expressed as a tuple of instructions (32 bit instructions in our prototype). Each
instruction comprises an 8 bit OP CODE, followed by a variable number of operands that
depends on the specific instruction. Input operands (INi) can be any of the available per-flow
registries Ri, the global variables Gi, or the header fields Hi provided by the Extractor.
Output operands (OUTi) indicate where the result of the instruction must be written (e.g. in a
given per-flow register, or in a global variable). In some instructions, one or more of the
operands (IOi) are used as both input and output. Our implementation supports 4 per-flow
registries, 4 global registries and 8 header fields, i.e. 16 possible sources, each addressable
with log2(16) = 4 bits. With 24 bits left after the OP CODE, an instruction may therefore in
principle carry up to 24/log2(16) = 6 operands. In practice, we envision at most 4 operands
(e.g., for the variance or for the ewma smoothing instructions), which leaves 6 bits per
operand, and thus our implementation may readily support up to 64 registries and header
fields in total. In the case of logic/arithmetic/shift operations, which
require at most two operands plus a third output, we have also considered the case in
which one of the operands is an immediate value, which can hence use 16 bits.
The packet/flow specific instructions supported in our prototype implement, as dedicated
HW primitives running at the system clock frequency and with a maximum latency of two clock
cycles¹, domain-specific operations which we deem useful in traffic control applications, and
which would normally require multiple clock cycles if implemented using more elementary
operations. Such domain-specific operations include the online computation of running
averages (avg) and variances (var), and the computation of exponentially decaying moving
averages (ewma), which can serve the purpose of a moving average but can be
incrementally computed and do not require maintaining a window of samples.
¹ As they involve a division, which we had to limit to 16 bits for dividend and divisor to target a 2 clock cycle latency.
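The operand budget of a 32 bit instruction discussed in this section can be written out as a short worked calculation. The only assumption is that the 24 bits left after the OP CODE are shared equally among the operands.

```latex
\[
\begin{aligned}
\text{operand bits} &= 32 - 8 = 24,\\
\text{sources} &= 4 + 4 + 8 = 16 \;\Rightarrow\; \log_2 16 = 4 \text{ bits per operand},\\
\text{maximum operands} &= 24 / 4 = 6,\\
\text{with 4 operands:}\quad & 24 / 4 = 6 \text{ bits per operand} \;\Rightarrow\; 2^{6} = 64 \text{ addressable sources},\\
\text{immediate value:}\quad & 24 - 2 \cdot 4 = 16 \text{ bits}.
\end{aligned}
\]
```

The last line corresponds to a three-operand logic/arithmetic/shift instruction in which one input and the output are 4 bit selectors and the remaining input is an immediate value.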
TABLE II: ALU packet/flow specific instructions
Usage and implementation details about packet/flow specific instructions are provided in Table
II. The avg operation stores the number of samples in IO1 and includes a new sample IN1 in
the running average IO2. Similarly, the var operation stores the number of samples in IO1, the
average of the value IN1 in IO2 and the variance in IO3. The ewma operation² was included to
permit smoothing. It stores the last timestamp (IN1) of a packet in the register identified by
IO1, computes the exponentially weighted moving average of the value IN2 using the equation
in Table II, and stores the result in IO2. As a final remark, similarly to the action set in standard
OpenFlow, we stress that the specific instruction set provided by the Update Logic Block is
independent of our proposed OPP abstraction, i.e., its extension or improvement (e.g. with
further dedicated domain-specific instructions) does not affect the overall OPP design.
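The ewma approximation described in the footnote to this paragraph (decay factor 1/2, so that the power becomes a right shift) can be sketched in software as follows. Integer timestamps in arbitrary ticks and the function name `ewma_update` are assumptions made for the example; the widths and saturation behaviour of the hardware primitive are not modelled.

```python
def ewma_update(mean, last_ts, sample, now):
    """One ewma step with alpha = 1/2: mean * (1/2)**(now - last_ts) + sample.

    mean and sample are non-negative integers, timestamps are integer ticks.
    Returns the new (mean, last_ts) pair, i.e. the values written to IO2 and IO1.
    """
    dt = now - last_ts
    if dt < 0:
        raise ValueError("timestamps must be non-decreasing")
    decayed = mean >> dt          # multiplying by (1/2)**dt is a right shift by dt
    return decayed + sample, now
```

Because only the previous mean and the previous timestamp are needed, the update fits in two registers and does not require a window of samples.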
² Let $t_k$ be the last sample time and $x_{k'}$ a new sample occurring at time $t_{k'}$. For simplicity of HW implementation we
approximate the exponentially weighted moving average as $m(t_{k'}) = m(t_k)\,\alpha^{(t_{k'} - t_k)} + x_{k'}$, and we use $\alpha = 1/2$ to
compute powers as shift operations. The intermediate decay quantity in the second line is used just for clarity of
presentation.
2.1.6 Miscellaneous Blocks and Microcontroller
To complete the HW architecture, a few necessary extra blocks have been implemented. The
Action Block applies the selected actions to the packet and is perfectly analogous to an
OpenFlow implementation. Being just a proof-of-concept, our prototype implements only basic
“sample” actions (drop, forward, flood). Global registries are implemented as a standard
register file unit for concurrent access. The Metadata block provides additional
information associated with an arriving packet, i.e., input port and timestamp. Finally, our
prototype has been complemented with a microcontroller providing a communication interface
(UART) to configure the various programmable components inside the OPP (configuration
registers, TCAM and RAM memories, etc.), i.e. to deploy in the switch an externally
programmed application. Each configurable quantity is memory mapped in the microcontroller
address space, so the microcontroller can directly read/write the content of these components. The
microcontroller further implements management functions, among which slow-time-scale flow
context table management (housekeeping): the microcontroller periodically scans the entries
in the flow context table to detect and clean stale entries. To this purpose, two activity flag bits
are stored in each flow entry and make it possible to label entries as ACTIVE, INACTIVE (no accesses
have occurred in a configurable management cycle, e.g., on the order of seconds), or DELETED. It is
worth noting that this is the only operation performed by the OPP that is not triggered by a
packet.
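One plausible way to drive the two activity flag bits is a clock-style ageing scan, sketched below. The text only names the three labels; the numeric encodings, the rule "ACTIVE becomes INACTIVE, INACTIVE becomes DELETED on each scan" and the function names are assumptions made for illustration.

```python
# Hypothetical encodings of the two activity flag bits.
ACTIVE, INACTIVE, DELETED = 0, 1, 2


def on_packet_access(entry):
    """Data path side: any lookup that hits the entry marks it ACTIVE again."""
    entry["flag"] = ACTIVE


def housekeeping_scan(table):
    """Microcontroller side: age every entry by one management cycle.

    Returns the number of entries newly marked DELETED in this scan.
    """
    deleted = 0
    for entry in table:
        if entry["flag"] == ACTIVE:
            entry["flag"] = INACTIVE      # no proof of activity yet in the new cycle
        elif entry["flag"] == INACTIVE:
            entry["flag"] = DELETED       # untouched for a whole cycle: stale
            deleted += 1
    return deleted
```

Under this assumed policy an entry is reclaimed only after it has gone a full management cycle without being accessed.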
2.1.7 Direct memory access via PCIe
A second way to configure the prototype uses DMA/PCI. This approach
gives direct access to the memory system without requiring an agent running on the
microcontroller that supports transitions. While the microcontroller is hosted on the prototype
and must be programmed to access the OPP memory address space, the DMA/PCI interface
connects the OPP with the operating system of the host PC in which the NetFPGA is installed.
DMA thus allows configuration software to run outside the FPGA and to access the prototype
components through the PCI architecture, which allows a device to request control of the bus and to read
from and write to system memory. As with the microcontroller, we access the memory system not
just to configure the prototype but also to retrieve debug/status information (such as hash table
lookups, packet registers, etc.).
2.2 FPGA prototype and initial evaluation of ASIC implementation
2.2.1 FPGA prototype
The OPP HW prototype has been designed using as target development board the NetFPGA
SUME [1], an x8 Gen3 PCIe adapter card incorporating a Xilinx Virtex-7 690T FPGA [3], four
SFP+ transceivers providing four 10GbE links, three 72 Mbit QDR II SRAMs and two 4GB DDR3
memories. The FPGA is clocked at 156.25 MHz, with a 64 bit data path from the Ethernet
ports, corresponding to a 10 Gbps throughput per port. The aggregated bus output of the
mixer is 320 bits wide and provides an overall throughput of 50 Gbps. The d-left hash
table implementing the flow context table is sized for 4K entries. In order to support the target
throughput, the RAMs composing the d-left table are realized as dual port RAMs, so as to
provide a read and a write operation in each clock cycle. The prototype implements very small
TCAMs. The TCAM associated with the hash table in the flow context table has 32 entries of 128
bits, whereas the XFSM TCAM has 128 entries of 160 bits. Indeed, TCAM implementation on
FPGAs is very inefficient and is currently a widely open research issue [4]–[6], especially since
the priority resolution hardware limits the maximum operating frequency as the number of
TCAM entries increases. We remark that this issue is related to the use of an FPGA as prototype
platform; an OPP ASIC design would be able to provide bigger TCAMs. The system latency, i.e.
the time interval from the first table lookup to the last context update, is 6 clock cycles. The
FPGA prototype is able to sustain the full throughput of 40 Gbit/s provided by the 4 switch
ports. If we suppose a minimum packet size of 40 bytes (320 bits), the system is able to
process 1 packet per clock cycle, and thus up to 6 packets could be pipelined. However,
the feedback loop (not present in the forward-only OpenFlow pipelines [7]) raises a concern:
the state update performed for a packet at the sixth clock cycle would be missed by pipelined
packets. This could be an issue for packets belonging to the same flow arriving back-to-back
(in consecutive clock cycles); in practice, as long as the system is configured to work by
aggregating N ≥ 6 different links, the mixer’s round robin policy will separate two packets
coming from the same link by N clock cycles, thus solving the problem. Note that the 6 clock
cycle latency is fixed by the hardware blocks used in the FPGA (the TCAM and the Block
RAMs) and basically does not change when scaling up the number of ingress ports or moving to an
ASIC. The whole system has been synthesized using the standard Xilinx design flow.
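The throughput figures quoted in this section follow directly from the clock frequency and the bus widths, as the following short check shows. All input numbers are taken from the text; the variable names are ours.

```python
CLOCK_HZ = 156.25e6          # FPGA clock frequency
PORT_BUS_BITS = 64           # data path width from each Ethernet port
MIXER_BUS_BITS = 320         # aggregated bus width at the mixer output
NUM_PORTS = 4
MIN_PACKET_BITS = 40 * 8     # minimum packet size of 40 bytes
LATENCY_CYCLES = 6           # first table lookup to last context update

port_gbps = PORT_BUS_BITS * CLOCK_HZ / 1e9        # throughput of one port
mixer_gbps = MIXER_BUS_BITS * CLOCK_HZ / 1e9      # throughput of the mixer bus
offered_gbps = NUM_PORTS * port_gbps              # load offered by the 4 ports

# A minimum-size packet fits in one bus word, hence one packet per clock cycle.
packets_per_cycle = MIXER_BUS_BITS // MIN_PACKET_BITS
packets_in_flight = LATENCY_CYCLES * packets_per_cycle

print(port_gbps, mixer_gbps, offered_gbps, packets_per_cycle, packets_in_flight)
```

The mixer bus therefore has headroom over the 40 Gbit/s offered by the four ports, and up to six minimum-size packets can be inside the stage at once, which is where the back-to-back hazard discussed above comes from.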
TABLE III: Hardware cost of OPP compared with the reference NetFPGA SUME switch.
Table III reports the logic and memory resources (in terms of absolute numbers and fraction of
available FPGA resources) used by the OPP FPGA implementation, and compares these results
with those required for the NetFPGA SUME single-stage reference switch. As expected, the
logic uses a small fraction of the total area (the increase with respect to the reference switch is
5% of the available FPGA logic resources), which is dominated by memory (which doubles with
respect to the reference switch). The synthesis results hence confirm the trend already shown by
[8]: the HW area is dominated by memory, while adding intelligence/features in the logic
requires a small silicon overhead.
2.2.2 Performance achievable with an ASIC implementation
As previously stated, while an FPGA prototype makes it possible to assess feasibility, a full
performance/scale architecture requires ASIC technology. The main blocks composing the OPP
are similar to those of other proposed programmable switch pipelines. Its novelty lies in the
way in which these basic blocks are connected together to provide the architecture described in
the previous sections. Therefore, a first estimate of the achievable performance can be
obtained by studying the characteristics and the performance requirements of the OPP building
blocks and comparing them with similar blocks used in other programmable switch pipelines
(mainly [8] and [9]). In particular, the OPP microcontroller, the ingress/egress queues and the
packet switch memory forming the packet queues are the same as in other programmable switch
pipelines. Following the technology assumptions of [8], an OPP ASIC design would be able to
manage 64 ports working at 10 Gb/s if clocked at 1 GHz. This frequency can be easily achieved
by the blocks composing the Condition Logic Block and the Update Logic Block. These blocks
are similar to the atoms described in [9], where the area of an atom is estimated at around 1000
µm², the overall chip area being around 200 mm². We remark that the microcontroller can
also be clocked at a lower frequency, since it is not used for wire speed processing. Several
microcontrollers with a very small area footprint are available as IP blocks to instantiate in the
chip (see e.g. the ARM Cortex M0 [10] or the Synopsys ARC core [11]). The most critical
blocks in terms of area and speed are therefore the TCAM realizing the XFSM Table and the
SRAM memory realizing the Flow Context Table. However, we expect that the TCAM for the
XFSM table will be smaller than the typical TCAM of stateless stages. In fact, TCAM entries for
stateless stages usually match against several packet headers and directly decide the action
to apply. Instead, the TCAM of stateful stages is used to describe the XFSM behaviour, so we
expect to have one TCAM rule for each XFSM transition. For this reason, we estimate that the
TCAM of an OPP stage can have 1K rows of 256 bits (256 Kbits of TCAM),
compared with the 1.2 Mbits of TCAM of an RMT stage. We expect that our smaller TCAM should
sustain the 1 GHz clock frequency. However, adding more pipeline stages could increase the
maximum frequency, with negligible drawbacks. The flow context table is composed of
multiple SRAM blocks with a data width of 256 bits. We foresee an overall on-chip SRAM
memory of 32 MB (the RMT SRAM memory is around 40 MB). Similarly to RMT, we expect to
configure the available SRAM blocks to provide different flow table sizes. When all the 32 stages
are stateful and the memory is equally distributed among all the OPP stages, each OPP stage
has 1 MB available. Instead, when the number of flows to store increases, it is possible to
allocate all the memory to a single OPP stage.
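The sizing figures used in this estimate can be summarised as a short worked calculation; every input number is taken from the paragraph above.

```latex
\[
\begin{aligned}
\text{aggregate port throughput} &= 64 \times 10~\text{Gb/s} = 640~\text{Gb/s},\\
\text{TCAM per OPP stage} &= 1\text{K rows} \times 256~\text{bits} = 256~\text{Kbits}
  \quad (\text{vs. } 1.2~\text{Mbits per RMT stage}),\\
\text{SRAM per OPP stage} &= 32~\text{MB} / 32~\text{stages} = 1~\text{MB}.
\end{aligned}
\]
```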
2.3 Application demo
To better illustrate the basic OPP architecture, we show a simple application developed
on the prototype. This application monitors the Ethernet destination address in order to
identify which flows receive more packets than a threshold. The OPP can be configured as
follows:
• We define two possible states for a flow: we assign the value 0x0B (ELEPHANT) if the flow exceeds a pre-defined threshold, and the value 0x0A (MICE) otherwise.
• We define the threshold in a global register that is used by all flows.
• We define two flow registers: the first counts the packets received while the threshold is not exceeded; the second counts the packets received after the threshold is exceeded.
• We configure the look-up and update extractors to select only the eth-dst address, and we configure the FSM table to match the flow state and the condition output (C0: GR0>FR1), as in the figure.

state   C0   next state   packet action   update function
0x0A    1    0x0A         OUT             FlowRegister1++
0x0A    0    0x0B         DROP            FlowRegister2++
0x0B    0    0x0B         DROP            FlowRegister2++

To initialize the demonstration we download and run the bitstream on the FPGA; this is performed with
the XMD command in the Linux terminal on the right in the figure.
On the left, another Linux terminal runs software, executed outside the FPGA, that
uses DMA to configure and debug it. Here we first select the command that configures the FPGA
with the application described previously. Afterwards we start sending packets on an interface
attached to the prototype, and then we check the application status by inspecting the flow registers in the hash
table.
[Figure labels: Key, Flow Register 1, Flow Register 2]
We send the first packet with tcpdump, and the OPP creates a new entry in the hash table that
stores the eth-dst address as the key, three flow registers and, as the last word, the present state. The
first flow register is incremented.
Then we send the remaining packets needed to exceed the threshold. The condition block now finds
that the flow register is greater than the global register. Hence the boolean vector given as input to the FSM table
leads to a different decision from the one taken for the previously processed
packets of the same flow. The hash table therefore shows the transition from the present state to the
second one: the first flow register exceeds the threshold and the second one is incremented.
[Figure annotation: Present state evolves!]
Finally, we can observe that the OPP changes its flow processing behaviour based on a decision
that does not depend on information carried by the packet, but only on arithmetic operations
computed inside the prototype.
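The behaviour of the demo can be reproduced with a small software model, given here as a sketch. The transition rows are our reading of the flattened FSM table of this section, the condition is evaluated on the register values before the update (an assumption about pipeline ordering), and the class and field names are illustrative.

```python
MICE, ELEPHANT = 0x0A, 0x0B

# (state, C0) -> (next state, packet action, register to increment)
XFSM = {
    (MICE, 1): (MICE, "OUT", "fr1"),
    (MICE, 0): (ELEPHANT, "DROP", "fr2"),
    (ELEPHANT, 0): (ELEPHANT, "DROP", "fr2"),
}


class ElephantDetector:
    def __init__(self, threshold):
        self.gr0 = threshold      # global register holding the threshold
        self.flows = {}           # eth-dst -> flow context

    def process(self, eth_dst):
        """Process one packet and return the action applied to it."""
        ctx = self.flows.setdefault(eth_dst, {"state": MICE, "fr1": 0, "fr2": 0})
        c0 = 1 if self.gr0 > ctx["fr1"] else 0          # condition C0: GR0 > FR1
        next_state, action, register = XFSM[(ctx["state"], c0)]
        ctx["state"] = next_state
        ctx[register] += 1                              # update function
        return action
```

With a threshold of 3, the first three packets towards a destination are forwarded and every later packet is dropped, while packets towards other destinations keep their own independent context.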
3 Preliminary work on HW-based flexible packet parsing
P4: Programming Protocol-independent Packet Processors [12,13] is a high-level, platform-agnostic
language. It represents a recent contribution to the broader idea of SDN and the SDN
ecosystem. Its main purpose is to provide a way to define the packet processing functionality of
network devices, paying attention to reconfigurability in the field, protocol independence and
target platform independence. It works in conjunction with SDN protocols like OpenFlow. The
current form of OpenFlow consists of fixed sets of protocol headers and actions. Such an approach
limits the flexibility and extensibility required of network hardware. The P4 language addresses these
problems because it allows the packet processing functionality of the data plane to be defined.
Using a relatively simple syntax, P4 describes five basic aspects of packet processing:
1. Header Formats – used for the specification of new protocol headers.
2. Packet Parser – used for the specification of a state machine
which defines the parsing process of incoming network data. Each packet parser state
defines transitions to the next packet parser states, based on the values of observed protocol
header fields.
3. Table Specification – the P4 language supports user-defined specification of the matching
process for extracted data. The user can define a set of header fields and a matching
algorithm for each header field in the set. Each valid match executes data processing
with a given action. All tables are filled at runtime of the P4 program.
4. Action Specification – defines compound actions that may be
executed for incoming packets.
5. Control Program – defines the control flow of a packet between tables based on
hit/miss events, conditional statements, and so on.
The P4 Language Consortium provides some basic tools which can be used in third-party
projects. The source code of these tools is publicly available under an open source license. The
main project is P4-HLIR [14], the front end of the P4 compiler, which creates a Python object
model of the P4 program. It is useful for other projects because one can easily continue
with the implementation of a compiler back end from this object representation. Two
projects, P4C-BEHAVIORAL [15] and P4C-GRAPHS [16], are available as examples of back
ends for two different targets (the C++ language in the case of P4C-BEHAVIORAL and DOT in the case
of P4C-GRAPHS).
3.1.1 Competitive solutions
The Network Protocol Description Language (NetPDL) [19] is an XML-based description language
for the creation of a universal protocol database. It does not support the specification of a protocol's
temporal behavior (such as a protocol state machine); instead, it targets an
effective description of protocol header formats and protocol encapsulation.
Another P4-like solution is the PP language, presented by Attig and Brebner in
[17]. It serves for the description of packet headers and the parse graph. The authors also demonstrate
a compiler from the PP language to VHDL. Unfortunately, the compiler (including the language
specification) is not publicly available.
Both of these solutions lack wider community support and are therefore not well aligned with the
BEBA standardization plans.
3.1.2 Compiling P4 to FPGA
Our goal, within the BEBA prototyping phase, is to generate an FPGA-based parser from the P4
language. Therefore, we need to use the subset of the P4 language which describes the packet
parser and protocol headers. These two parts of the P4 language give a unique specification
of a parsing engine which can be translated to VHDL.
The main idea of automatic generation of a high-speed parser comes from the HFE M2
architecture [18]. This architecture supports the modular connection of protocol analyzers
which are separated by pipelines. The protocol analyzer is the main block of the whole approach
because it understands the processing of one protocol. A generic interface is used for the connection
between modules. There is an optional pipeline block between each pair of protocol analyzers.
The pipeline blocks can be individually enabled or disabled at compile time to tune the final
frequency, latency and chip area. Protocol analyzers and pipeline blocks are connected into a
processing chain which represents the supported protocol stack of a processed network packet.
An example of this architecture is shown in Figure 4.
Figure 4: HFE M2 architecture
Knowing the structure of HFE M2, we can derive and define the transformation process from
P4 to FPGA (using the VHDL language). First of all, we need to identify and generate the protocol
analyzers. These modules can be generated from P4's Header Formats (an example of this
definition is shown later in this text). After successful generation of the protocol analyzers,
we need to connect these modules into a processing chain. This task can be generally solved
using a Parser Graph Representation (PGR).
We define the PGR as a directed graph which is generated from the P4's Packet Parser
definition. Each node (or state) represents one packet header and each transition represents
the next parsed protocol. Each transition is taken based on the parsed protocol data. The condition
of a transition is inferred from the P4's Packet Parser description. Each PGR node also contains
a pointer to the P4's Header Format definition, which is needed during the generation of individual
protocol analyzers. An example of a PGR structure is shown in Figure 5.
Figure 5: The example of Parser Graph Representation
The key problem is to determine where each protocol analyzer should be inserted in the
processing chain. This problem can be solved by an algorithm that identifies the longest
path to each node in the PGR. The length of the longest path from the root to a node (the
number shown in the node) gives the node's position in the generated processing chain. In
other words, the longest path in the PGR represents the most complicated protocol stack that
the parser needs to support.
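The level assignment described above can be sketched in Python as follows. This is a minimal sketch, not the generator's actual code: the graph contents, the `node_levels` helper and the dictionary encoding of the PGR are assumptions made for illustration. Nodes are processed in topological order (using the standard `graphlib` module, Python 3.9 or later), so the result does not depend on the order in which successors are listed.

```python
from graphlib import TopologicalSorter

# Hypothetical PGR: node -> list of successor nodes (illustrative only).
pgr = {
    "ethernet": ["vlan", "ipv4"],
    "vlan": ["ipv4"],
    "ipv4": ["tcp", "udp"],
    "tcp": [],
    "udp": [],
}


def node_levels(graph, root):
    """Return the longest-path length from root to every reachable node."""
    # graphlib expects node -> predecessors, so invert the successor lists.
    predecessors = {node: set() for node in graph}
    for node, successors in graph.items():
        for nxt in successors:
            predecessors.setdefault(nxt, set()).add(node)

    levels = {root: 0}
    for node in TopologicalSorter(predecessors).static_order():
        if node not in levels:
            continue  # node is not reachable from the root
        for nxt in graph.get(node, []):
            # Keep the longest distance seen so far for each successor.
            levels[nxt] = max(levels.get(nxt, 0), levels[node] + 1)
    return levels


levels = node_levels(pgr, "ethernet")
```

In this hypothetical graph, `ipv4` ends up at position 2 even though it is directly reachable from `ethernet`, because the longest path to it runs through `vlan`.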
An example of a P4 Header Format and Packet Parser specification is shown in the following
text. Example a) shows the definition of the Ethernet header. The description simply lists
all fields of the packet header and their widths in bits. Example b) shows the definition of
the Packet Parser state that is used for processing the Ethernet header. The example
consists of an extract statement and a switch statement. The extract statement instructs the
parser to examine input packets and look for the data defined in the header. The parsed data
is then used in the switch statement to determine the next state (protocol) to process.
Example a)
header ethernet {
    fields {
        dst_addr : 48;
        src_addr : 48;
        ethertype : 16;
    }
}
Example b)
header ethernet eth;
parser ethernet {
    extract(eth);
    switch(eth.ethertype) {
        case 0x8100,0x9100: vlan;
        case 0x800: ipv4;
    }
}
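To make the semantics of the two examples concrete, the behaviour of the Ethernet parser state can be mimicked in software. The Python sketch below is only an illustration of the extract and switch steps; the function name `parse_ethernet`, the dictionary layout and the use of `None` for an unmatched ethertype are assumptions, not part of P4 or of the generated VHDL.

```python
import struct

# Field widths from Example a): dst_addr 48, src_addr 48, ethertype 16 bits.
ETHERNET_LEN = 14  # (48 + 48 + 16) / 8 bytes

# Transitions from Example b): ethertype value -> next parser state.
NEXT_STATE = {0x8100: "vlan", 0x9100: "vlan", 0x0800: "ipv4"}


def parse_ethernet(packet: bytes):
    """Extract the Ethernet header and select the next parser state."""
    if len(packet) < ETHERNET_LEN:
        raise ValueError("packet shorter than an Ethernet header")
    # "extract": slice the header fields out of the packet (network byte order).
    dst_addr, src_addr, ethertype = struct.unpack("!6s6sH", packet[:ETHERNET_LEN])
    header = {"dst_addr": dst_addr, "src_addr": src_addr, "ethertype": ethertype}
    # "switch": pick the next state; None (no matching case) is an assumption.
    return header, NEXT_STATE.get(ethertype)


# Hypothetical frame: broadcast destination, made-up source, IPv4 ethertype.
frame = bytes.fromhex("ffffffffffff" "020000000001" "0800") + b"payload"
header, next_state = parse_ethernet(frame)
```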
3.1.3 Transformation algorithm
We have introduced the main idea of transforming P4 source code to the HFE M2 architecture.
The following text provides pseudocode of the transformation algorithm, together with its
time complexity analysis.
The whole transformation algorithm is defined as follows:
Function TransformationToParser(prog)
Input: prog = P4 program
Result: VHDL code of the parser architecture
begin
    /* 1) Identify the parser graph representation (use the DFS algorithm) */
    graph = GetParserGraphRepresentation(prog);
    /* 2) Mark all nodes as Fresh. After that, traverse the graph and identify
       the level of each node (use the DFS algorithm). Level of the node =
       position in the processing chain */
    MarkFresh(graph);
    FindNodeLevels(graph.root, 0);
    /* 3) Generate protocol analyzers and connect them to the processing chain */
    return GenerateProcessingPipeline(graph);
end

Function FindNodeLevels(node, curr_level)
Input: node = current node to process
Input: curr_level = current level of the node
begin
    if node.fresh == False then
        return;
    end
    /* Mark the node as not fresh and update the level */
    node.fresh = False;
    act_level = node.get_level();
    if act_level < curr_level then
        node.set_level(curr_level);
    end
    /* For all fresh successors, update the level and call the same function */
    node_successors = node.get_next_states();
    for next_node in node_successors do
        /* Do not call the node if the longest path to it already exists */
        if next_node.get_level() - node.get_level() < 1 then
            FindNodeLevels(next_node, curr_level + 1);
        end /* End of if */
    end /* End of for */
end
The time complexity of the proposed transformation algorithm consists of the following
components:
1. GetParserGraphRepresentation runs in O(V+E) time (DFS algorithm), where V is the number
of nodes and E is the number of edges. In general, the maximum number of edges in an acyclic
graph is n(n-1)/2, where n is the number of protocol analyzers (i.e., nodes of the PGR). The
total time complexity of the DFS is therefore O(n + n(n-1)/2) = O(n^2).
2. FindNodeLevels runs in O(n^2) time (DFS algorithm), where n is the number of protocol
analyzers.
3. MarkFresh runs in O(n) time, where n is the number of protocols (i.e., nodes of the PGR).
4. GenerateProcessingPipeline runs in O(n) time because we are generating a processing chain
with n protocol analyzers.
The total time complexity of the transformation is O(n^2) + O(n^2) + O(n) + O(n) =
O(2n^2 + 2n) = O(n^2).
3.1.4 Reached results
In this section, we present results for the generated parsers. We have tested the properties
of the generated parsers with two different protocol stacks:
• simple L2 - Ethernet, IPv4/IPv6 (with 2×extension headers), TCP/UDP, ICMP/ICMPv6
• full - Ethernet, 2×VLAN, 2×MPLS, IPv4/IPv6 (with 2×extension headers), TCP/UDP,
ICMP/ICMPv6
We use Slice Logic (the number of used LUTs plus flip-flops) as the metric of resource
utilization because these are the most heavily utilized resources in FPGA designs. We report
results after synthesis for the Xilinx Virtex-7 XC7VH580T FPGA using the Xilinx Vivado 2015.1
design tool. All parsers were synthesized with different settings of the data width (256 and
512 bits) and of the presence of pipeline modules. These settings, together with the
resulting frequency, latency and resource usage, span a large space of possible solutions
for each P4-based parser. We searched this space for the Pareto set, which allows us to pick
the best-fitting solution for a given application.
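As an illustration of how a Pareto set can be extracted from synthesis results, consider the following Python sketch. It is a sketch under stated assumptions: the `Solution` record, the `dominates` and `pareto_set` helpers and all numbers are hypothetical and are not measured results, and all three metrics are compared at once, whereas the figures in this section pair throughput with resources and with latency.

```python
from typing import NamedTuple


class Solution(NamedTuple):
    name: str
    throughput_gbps: float  # higher is better
    latency_ns: float       # lower is better
    slice_logic: int        # LUTs + flip-flops, lower is better


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is at least as good as b everywhere and better somewhere."""
    no_worse = (
        a.throughput_gbps >= b.throughput_gbps
        and a.latency_ns <= b.latency_ns
        and a.slice_logic <= b.slice_logic
    )
    strictly_better = (
        a.throughput_gbps > b.throughput_gbps
        or a.latency_ns < b.latency_ns
        or a.slice_logic < b.slice_logic
    )
    return no_worse and strictly_better


def pareto_set(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Made-up candidates, for illustration only.
candidates = [
    Solution("A", 100.0, 40.0, 5000),
    Solution("B", 120.0, 60.0, 7000),
    Solution("C", 90.0, 50.0, 6000),   # dominated by A
    Solution("D", 120.0, 60.0, 8000),  # dominated by B
]
front = pareto_set(candidates)
```

A designer would then choose among the remaining solutions (here A and B) according to the constraints of the target application.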
For the simple L2 protocol stack, the searched space contains all possible combinations.
For the full parser, there are 2^14 possible solutions, and synthesizing all of them would
be very time-consuming. Therefore, we randomly selected 20% of all possible solutions, which
allows us to inspect the properties of the generated processing chain in a reasonable
time.
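The random selection can be sketched as follows, assuming that the full-stack design space is described by 14 independent binary settings (2^14 = 16384 combinations). The constant names, the encoding of a configuration as a tuple of booleans and the fixed seed are assumptions made for this illustration.

```python
import itertools
import random

N_SETTINGS = 14         # assumed: 14 independent on/off settings -> 2**14 points
SAMPLE_FRACTION = 0.20  # share of the space that is actually synthesized

# Enumerate every configuration as a tuple of booleans.
all_configs = list(itertools.product((False, True), repeat=N_SETTINGS))

# Draw a sample without replacement; the fixed seed keeps it reproducible.
rng = random.Random(0)
sample = rng.sample(all_configs, int(len(all_configs) * SAMPLE_FRACTION))
```

Under these assumptions, 3276 of the 16384 configurations would be passed on to synthesis.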
To compare the Pareto sets achieved for the different protocol stacks, we provide the
following graphs: Figure 6 (throughput versus FPGA resources) and Figure 7 (throughput
versus latency). The Pareto sets show the best achievable solutions for our parsers. From
these figures, we can see that the supported protocol stack can significantly change the
parameters of the parser in terms of FPGA resources and latency.
The presented results show that: (1) we can generate parsers capable of processing data at
speeds beyond 100 Gbps and (2) we can generate them in at most two seconds.
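To relate the 100 Gbps figure to the data widths used in synthesis, the short calculation below estimates the clock frequency each width would need. It assumes that the parser accepts one full data word per clock cycle and ignores framing overhead; this assumption and the helper name `required_frequency_mhz` are ours and are not stated in the text.

```python
def required_frequency_mhz(target_gbps: float, data_width_bits: int) -> float:
    """Clock frequency needed if one data word is processed per cycle."""
    return target_gbps * 1e9 / data_width_bits / 1e6


# Data widths taken from the synthesis settings described in this section.
needed = {width: required_frequency_mhz(100, width) for width in (256, 512)}
```

Under this assumption, a 512-bit data path would need roughly 195.3 MHz to reach 100 Gbps, while a 256-bit data path would need about 390.6 MHz.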
Figure 6: Comparison of the FPGA resource utilization versus throughput Pareto sets for the tested protocol stacks
Figure 7: Comparison of the latency versus throughput Pareto sets for the tested protocol stacks
References
[1] ofsoftswitch13 project homepage: https://github.com/CPqD/ofsoftswitch13
[2] Ryu project homepage: http://osrg.github.io/ryu/
[3] “Virtex-7 Family Overview,” http://www.xilinx.com
[4] B. Jean-Louis, “Using block RAM for high performance read/write TCAMs,” 2012.
[5] Z. Ullah, M. Jaiswal, Y. Chan, and R. Cheung, “FPGA Implementation of SRAM-based
Ternary Content Addressable Memory,” in IEEE 26th International Parallel and
Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012.
[6] W. Jiang, “Scalable ternary content addressable memory implementation using
FPGAs,” in Architectures for Networking and Communications Systems (ANCS), 2013
ACM/IEEE Symposium on, 2013, pp. 71–82.
[7] Open Networking Foundation, “OpenFlow Switch Specification ver 1.4,” Oct. 2013.
[8] P. Bosshart, G. Gibb, H.-S. Kim, G. Varghese, N. McKeown, M. Izzard, F. Mujica, and
M. Horowitz, “Forwarding metamorphosis: Fast programmable match-action
processing in hardware for sdn,” in ACM SIGCOMM Conference, 2013, pp. 99–110.
[9] A. Sivaraman, A. Cheung, M. Budiu, C. Kim, M. Alizadeh, H. Balakrishnan, G.
Varghese, N. McKeown, and S. Licking, “Packet transactions: High-level programming
for line-rate switches,” in Proceedings of the 2016 Conference on ACM SIGCOMM
2016 Conference, ser. SIGCOMM ’16. ACM, 2016, pp. 15–28. [Online]. Available:
http://doi.acm.org/10.1145/2934872.2934900
[10] “ARM Cortex-M0 Processor,”
https://www.arm.com/products/processors/cortexm/cortex-m0.php.
[11] “Synopsys DesignWare ARC EM4 Processor Core,”
https://www.synopsys.com/dw/ipdir.php?ds=arc em4 processor
[12] P. Bosshart, D. Daly, G. Gibb, M. Izzard, N. McKeown, J. Rexford, C. Schlesinger, D.
Talayco, A. Vahdat, G. Varghese, and D. Walker. P4: Programming protocol-independent
packet processors. SIGCOMM Comput. Commun. Rev., 44(3):87–95, July 2014.
[13] P4 Language Consortium. P4. http://p4.org/
[14] P4 Language Consortium. P4-HLIR. https://github.com/p4lang/p4-hlir
[15] P4 Language Consortium. P4C-BEHAVIORAL. https://github.com/p4lang/p4c-behavioral
[16] P4 Language Consortium. P4-GRAPHS. https://github.com/p4lang/p4c-graphs
[17] M. Attig and G. Brebner. 400 gb/s programmable packet parsing on a single fpga. In
Proceedings of the 2011 ACM/IEEE Seventh Symposium on Architectures for
Networking and Communications Systems, ANCS ’11, pages 12–23. IEEE Computer
Society, 2011.
[18] V. Pus, L. Kekely, and J. Korenek. Low-latency modular packet header parser for
FPGA. In Proceedings of the Eighth ACM/IEEE Symposium on Architectures for
Networking and Communications Systems, ANCS’12, pages 77–78, New York, NY,
USA, 2012. ACM.
[19] F. Risso and M. Baldi. Netpdl: An extensible xml-based language for packet header
description. Comput. Netw., 50(5):688–706, Apr. 2006.