
Is there an efficient algorithm for pattern-based classification of path parameters in URLs?

etoahn · Feb 3, 2023 · 2054 views

    For example, suppose I have the following URLs (input):

    https://www.showcase.com/user/home
    
    https://www.showcase.com/bill/BlKLSJDFLJERSDF
    https://www.showcase.com/bill/BSERlKLSSDFEJSDF
    https://www.showcase.com/bill/BSDREWRDF
    https://www.showcase.com/bill/BSERDWEDFEJSDF # there may be 100+ similar URLs
    
    https://www.showcase.com/bill/BlKLSJDFLJERSDF/detail
    https://www.showcase.com/bill/BSERlKLSSDFEJSDF/detail
    https://www.showcase.com/bill/BSDREWRDF/detail
    https://www.showcase.com/bill/BSERDWEDFEJSDF/detail # there may be 100+ similar URLs
    
    
    https://www.showcase.com/topic/234566833245234566
    https://www.showcase.com/topic/200000234523456683
    https://www.showcase.com/topic/2586683567243w56324 # there may be 100+ similar URLs
    
    
    # Plus many other URLs. Their patterns follow no fixed regex, so they can only be found through statistical analysis.
    
    

    Classified as (output):

    https://www.showcase.com/user/home
    https://www.showcase.com/bill/{param} 
    https://www.showcase.com/bill/{param}/detail
    https://www.showcase.com/topic/{param}
    

    So far the only approach I can think of is pattern recognition. Does anyone know of other methods?
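One way to read "statistical analysis" here is to count distinct values at each path position: a position that takes many different values under the same prefix is probably a parameter. The sketch below is only an illustration of that idea, not something proposed in the thread. The function name `infer_templates` and the threshold `max_distinct` are assumptions made up for this example, and the threshold would need tuning against real traffic.

```python
from collections import defaultdict
from urllib.parse import urlsplit

PARAM = "{param}"


def infer_templates(urls, max_distinct=5):
    """Collapse path positions with many distinct values into {param}.

    max_distinct is a hypothetical tuning knob: a position is treated as a
    parameter when more than this many distinct values follow one prefix.
    """
    # Each item: (origin, raw path segments, templated prefix built so far).
    items = []
    for url in urls:
        parts = urlsplit(url)
        origin = f"{parts.scheme}://{parts.netloc}"
        segments = [s for s in parts.path.split("/") if s]
        items.append((origin, segments, []))

    depth = 0
    while True:
        # Distinct segment values at this depth, grouped by templated prefix.
        values = defaultdict(set)
        for origin, segments, prefix in items:
            if len(segments) > depth:
                values[(origin, tuple(prefix))].add(segments[depth])
        if not values:
            break
        for origin, segments, prefix in items:
            if len(segments) > depth:
                crowded = len(values[(origin, tuple(prefix))]) > max_distinct
                prefix.append(PARAM if crowded else segments[depth])
        depth += 1

    return sorted({origin + "/" + "/".join(prefix) for origin, _, prefix in items})
```

Because the grouping key is the already-templated prefix, `/bill/<id>/detail` URLs share the prefix `bill/{param}` and `detail` stays literal. The approach misfires when a parameter has few observed values or when a site has more literal top-level sections than the threshold allows.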

    4 replies    2023-02-03 13:43:41 +08:00
    #1 · Coderuancun · Feb 3, 2023
    Use tokenization (word segmentation); there are algorithms for that.
    #2 · acmerliu · Feb 3, 2023
    Hidden Markov model.
    #3 · Jooooooooo · Feb 3, 2023
    Isn't this just regex?
    #4 · 34127chi · Feb 3, 2023
    Isn't this just regex?
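For comparison, the regex suggestion could look something like the sketch below. It rests on a loud assumption that happens to fit the sample URLs but is not stated anywhere in the thread: literal segments are all-lowercase words, while parameter values contain at least one digit or uppercase letter. The names `ID_LIKE` and `to_template` are hypothetical. The original post says the real patterns are not fixed, so a rule like this would be at most a first-pass filter.

```python
import re
from urllib.parse import urlsplit

# Assumption for this sketch: any digit or uppercase letter marks an ID-like segment.
ID_LIKE = re.compile(r"[0-9A-Z]")


def to_template(url):
    """Replace ID-like path segments with {param}, one URL at a time."""
    parts = urlsplit(url)
    segments = [
        "{param}" if ID_LIKE.search(segment) else segment
        for segment in parts.path.split("/")
        if segment
    ]
    return f"{parts.scheme}://{parts.netloc}/" + "/".join(segments)
```

Unlike a counting approach, this needs no corpus and classifies each URL independently, but it will mislabel lowercase slugs and any literal segment that contains a digit.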