Method for HTTP-based access point fingerprint and classification using machine learning

Patent number
US11399288B2
Publication date
2022-07-26
Applicant
SAMSUNG ELETRÔNICA DA AMAZÔNIA LTDA. (BR Campinas)
Inventor
Igor Jochem Sanz
IPC classification
H04W12/122; G06N20/20; H04L9/40; H04W12/60; H04W12/79
Technical field
http, ap, packet, html, captive, header, server, malicious, phishing, portal
Region: São Paulo

Abstract

A method for HyperText Transfer Protocol (HTTP) based fingerprinting and classification. The method includes training an HTTP-based machine-learning model using machine-learning training techniques and a historical dataset of collected, labelled Access Point (AP) HTTP service response features. The method is useful for detecting benign or malicious classes, assessing potential trustworthiness, and detecting any type of bad behavior of an HTTP server, as well as any other threat that modifies or implements an AP HTTP server or webpage. The method takes advantage of the captive portal detection packet exchange between a station and an AP to passively classify the AP.

Description

FIG. 4 shows the workflow of the off-line machine-learning model generation process. Model generation starts with a historical dataset of raw AP HTTP response data (401), which may be synthetically generated, generated in a controlled environment, or collected using the method of the invention when the AP class is known. The raw data may be in network dump format, a structured or unstructured format, text format, or any file format that contains the information needed to extract the selected features (402). The HTTP response data may or may not record each packet's position in the HTTP redirect chain at the time of capture. If the data contains packet order information, it may be used to tailor specific models to each packet position, or as a weight function during the final combination of model results. The AP HTTP responses are then labelled according to the objective of the classification model (403). Next, different types of feature extractors convert all labelled raw AP response data (404) into numerical features. The feature extraction method (405) includes an HTTP header feature extractor, suitable for all packets; an HTTP content feature extractor, which may or may not be used depending on whether the packet has sufficient content data, as may be defined by an HTML data size threshold; and an NLP feature extractor, which is executed only over the last packet responses of a redirect chain. After feature extraction, the data is represented as feature vectors (406), which serve as input to the off-line model training algorithms (407). Off-line model training algorithms refer to any learning algorithm, whether based on clustering, classification, distance, or another machine-learning technique, that aims to create a model generalizing the properties and patterns of historical data in order to make predictions or classifications on future data.
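The three-extractor arrangement of step 405 can be sketched as follows. This is only an illustration under stated assumptions: the `HttpResponse` container, the threshold value of 256 bytes, and every individual feature (status code, header count, form and input counts, keyword hits) are hypothetical choices, since the description does not enumerate the features or fix a threshold.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed value: the description only says a size threshold "may" be defined.
HTML_SIZE_THRESHOLD = 256


@dataclass
class HttpResponse:
    """One captured AP HTTP response (hypothetical container, not from the source)."""
    status: int
    headers: Dict[str, str]
    body: str
    is_last_in_chain: bool


def header_features(resp: HttpResponse) -> List[float]:
    # Header extractor: suitable for all packets.
    names = {name.lower() for name in resp.headers}
    return [
        float(resp.status),
        float(len(resp.headers)),
        1.0 if "location" in names else 0.0,
        1.0 if "server" in names else 0.0,
    ]


def content_features(resp: HttpResponse) -> List[float]:
    # Content extractor: used only when the body reaches the size threshold.
    if len(resp.body) < HTML_SIZE_THRESHOLD:
        return [0.0, 0.0, 0.0]
    lower = resp.body.lower()
    return [float(len(resp.body)), float(lower.count("<form")), float(lower.count("<input"))]


def nlp_features(resp: HttpResponse) -> List[float]:
    # NLP extractor: runs only on the last response of a redirect chain.
    # Word and keyword counts stand in for a real NLP model (assumption).
    if not resp.is_last_in_chain:
        return [0.0, 0.0]
    lower = resp.body.lower()
    keyword_hits = sum(lower.count(word) for word in ("password", "login"))
    return [float(len(lower.split())), float(keyword_hits)]


def extract(resp: HttpResponse) -> List[float]:
    # Concatenate the three extractor outputs into one fixed-length feature vector (406).
    return header_features(resp) + content_features(resp) + nlp_features(resp)
```

Padding skipped extractors with zeros keeps every vector the same length, which most training algorithms require; a real system might instead train separate models on differently shaped vectors.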
The training process may also include data cleaning, data cleansing, data filtering, feature selection, feature reduction, normalization, standardization, and cross-validation, among other pre-processing and post-processing techniques. The off-line model training process may generate several models with their respective results (408), from which the best model may be selected according to an evaluation criterion. If packet order is considered in the model generation process, the input data is split according to packet order to create a different machine-learning model for each specific packet position in the HTTP redirect chain. The final machine-learning models generated (409) may be represented as a binary file, an object file, parameter values in the case of parametric models, weights, a text description, or any type or combination of data files that entirely represents a machine-learning model.
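As a rough sketch of steps 407 to 409, the code below splits samples by packet order, trains candidate models for each order, keeps the candidate with the best score, and serializes the result. The nearest-centroid classifier, the two distance metrics, held-out accuracy as the evaluation criterion, and `pickle` as the binary format are all assumptions chosen for brevity; the description leaves the algorithm, criterion, and file format open.

```python
import pickle
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Sample = Tuple[int, Vector, str]  # (packet order in redirect chain, features, label)


def euclidean(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a: Vector, b: Vector) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


# Candidate models differ only in their distance metric (illustrative choice).
DISTANCES = {"euclidean": euclidean, "manhattan": manhattan}


def fit_centroids(rows: List[Tuple[Vector, str]]) -> Dict[str, List[float]]:
    # One centroid (per-feature mean) for each class label.
    grouped = defaultdict(list)
    for vec, label in rows:
        grouped[label].append(vec)
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in grouped.items()}


def predict(model: dict, vec: Vector) -> str:
    # Assign the label of the closest centroid.
    dist = DISTANCES[model["distance"]]
    centroids = model["centroids"]
    return min(centroids, key=lambda label: dist(centroids[label], vec))


def accuracy(model: dict, rows: List[Tuple[Vector, str]]) -> float:
    return sum(predict(model, vec) == label for vec, label in rows) / len(rows)


def split_by_order(samples: List[Sample]) -> Dict[int, List[Tuple[Vector, str]]]:
    out = defaultdict(list)
    for order, vec, label in samples:
        out[order].append((vec, label))
    return out


def train_per_order(train: List[Sample], held_out: List[Sample]) -> Dict[int, dict]:
    # One model per packet order; the best candidate is chosen by held-out accuracy.
    eval_sets = split_by_order(held_out)
    models = {}
    for order, rows in split_by_order(train).items():
        centroids = fit_centroids(rows)
        candidates = [{"distance": name, "centroids": centroids} for name in DISTANCES]
        # Fall back to the training rows when no held-out data exists for this order.
        eval_rows = eval_sets.get(order) or rows
        models[order] = max(candidates, key=lambda m: accuracy(m, eval_rows))
    return models
```

Because each model is a plain dictionary of strings and floats, `pickle.dumps(models)` yields a binary representation and `pickle.loads` restores it unchanged, which corresponds to the "binary file" option listed for the final models (409).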

Claims
