(beta) Using the TORCH_LOGS Python API with torch.compile

Author: Michael Lazos, Translator: 장효영

import logging

This tutorial introduces the ``TORCH_LOGS`` environment variable and the corresponding Python API, and shows how to use them to observe the phases of ``torch.compile``.

Note

This tutorial requires PyTorch 2.2.0 or later.
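A minimal sketch of how a script could check this version requirement before running. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of PyTorch; in a real script the string passed in would be `torch.__version__`, which may carry a suffix such as `+cu121`.

```python
import re


def parse_version(version: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from strings such as '2.2.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def meets_minimum(version: str, minimum: tuple[int, int, int] = (2, 2, 0)) -> bool:
    # Compare integer tuples: plain string comparison would rank
    # "2.10.0" below "2.2.0".
    return parse_version(version) >= minimum


# Hypothetical version strings; substitute torch.__version__ in practice.
for candidate in ("2.2.0+cu121", "2.1.2", "2.10.0"):
    print(candidate, meets_minimum(candidate))
```

Comparing tuples of integers rather than raw strings keeps the check correct once the minor version reaches two digits.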

Setup

In this example, we'll set up a simple Python function that performs an elementwise add, and observe its compilation with the ``TORCH_LOGS`` Python API.

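Note

There is also an environment variable, ``TORCH_LOGS``, that can be used to change logging settings from the command line. The equivalent environment variable setting is shown for each example.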
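The correspondence between the two interfaces can be sketched with a small helper that turns `set_logs`-style keyword arguments into a ``TORCH_LOGS`` string. The function `to_torch_logs` is hypothetical and not part of PyTorch. It assumes the prefix convention used by the environment variable: `+name` for `logging.DEBUG`, a bare `name` for `logging.INFO` or for an artifact switched on with `True`, and `-name` for `logging.ERROR`.

```python
import logging


def to_torch_logs(**settings) -> str:
    """Build a TORCH_LOGS value from set_logs-style keyword arguments.

    Hypothetical helper for illustration; it does not touch torch itself.
    """
    parts = []
    for name, value in settings.items():
        if value is False:
            continue  # artifact left switched off
        if value is True:
            parts.append(name)  # artifact switched on, e.g. graph=True
        elif value == logging.DEBUG:
            parts.append(f"+{name}")
        elif value == logging.INFO:
            parts.append(name)
        elif value == logging.ERROR:
            parts.append(f"-{name}")
        else:
            raise ValueError(f"no TORCH_LOGS spelling assumed for {name}={value!r}")
    return ",".join(parts)


# The four settings used in this tutorial, combined into one value.
combined = to_torch_logs(
    dynamo=logging.DEBUG, graph=True, fusion=True, output_code=True
)
print(f'TORCH_LOGS="{combined}"')
```

The resulting string would be set in the shell before launching the script, so logging can be changed without editing the code.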

import torch

# exit cleanly if we are on a device that doesn't support torch.compile
if not torch.cuda.is_available() or torch.cuda.get_device_capability() < (7, 0):
    print("Skipping because torch.compile is not supported on this device.")
else:
    @torch.compile()
    def fn(x, y):
        z = x + y
        return z + 2


    inputs = (torch.ones(2, 2, device="cuda"), torch.zeros(2, 2, device="cuda"))


    # print a separator and reset dynamo between examples
    def separator(name):
        print(f"==================={name}=========================")
        torch._dynamo.reset()


    separator("Dynamo Tracing")
    # View dynamo tracing
    # TORCH_LOGS="+dynamo"
    torch._logging.set_logs(dynamo=logging.DEBUG)
    fn(*inputs)

    separator("Traced Graph")
    # View the traced graph
    # TORCH_LOGS="graph"
    torch._logging.set_logs(graph=True)
    fn(*inputs)

    separator("Fusion Decisions")
    # View fusion decisions
    # TORCH_LOGS="fusion"
    torch._logging.set_logs(fusion=True)
    fn(*inputs)

    separator("Output Code")
    # View the output code generated by inductor
    # TORCH_LOGS="output_code"
    torch._logging.set_logs(output_code=True)
    fn(*inputs)

    separator("")
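Each log line these calls emit starts with a compact prefix: a level letter, the date as MMDD, a timestamp, the process ID, the source file and line, and usually a compile ID such as `[0/0]`. The sketch below splits that prefix into fields. The regular expression and the `LogRecord` type are assumptions read off the output shown in this tutorial, not a format guaranteed by PyTorch, and the letter-to-level mapping (for example `V` for DEBUG) is likewise inferred.

```python
import re
from typing import NamedTuple, Optional

# Assumed mapping from the leading letter to a logging level name.
LEVELS = {"V": "DEBUG", "I": "INFO", "W": "WARNING", "E": "ERROR", "C": "CRITICAL"}

# Assumed prefix layout, inferred from the sample output in this tutorial.
PREFIX = re.compile(
    r"^(?P<level>[VIWEC])(?P<date>\d{4}) (?P<time>[\d:.]+) (?P<pid>\d+) "
    r"(?P<file>\S+):(?P<line>\d+)\] "
    r"(?:\[(?P<compile_id>\d+/\d+(?:_\d+)?)\] ?)?(?P<message>.*)$"
)


class LogRecord(NamedTuple):
    level: str
    pid: int
    file: str
    line: int
    compile_id: Optional[str]
    message: str


def parse_line(text: str) -> Optional[LogRecord]:
    """Return the parsed fields, or None for lines without the prefix."""
    match = PREFIX.match(text)
    if match is None:
        return None
    return LogRecord(
        level=LEVELS[match["level"]],
        pid=int(match["pid"]),
        file=match["file"],
        line=int(match["line"]),
        compile_id=match["compile_id"],
        message=match["message"],
    )


# Lines copied from the output of this tutorial.
samples = [
    "I1004 00:38:40.555000 3764082 site-packages/torch/_dynamo/output_graph.py:1842] [0/0] Step 2: calling compiler function inductor",
    "I1004 00:38:41.386000 3764082 site-packages/torch/_dynamo/__init__.py:118] torch._dynamo.reset",
    "===================Traced Graph=========================",
]
records = [parse_line(sample) for sample in samples]
for record in records:
    print(record)
```

Splitting lines this way makes it easier to filter a long DEBUG log, for example keeping only INFO-level records or only those for one compile ID.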

===================Dynamo Tracing=========================
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0] torchdynamo start compiling fn /workspace/tutorials-kr/recipes_source/torch_logs.py:44, stack (elided 5 frames):
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/bin/sphinx-build", line 7, in <module>
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     sys.exit(main())
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx/cmd/build.py", line 339, in main
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     return make_main(argv)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx/cmd/build.py", line 213, in make_main
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     return make_mode.run_make_mode(argv[1:])
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx/cmd/make_mode.py", line 181, in run_make_mode
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     return make.run_generic_build(args[0])
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx/cmd/make_mode.py", line 169, in run_generic_build
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     return build_main(args + opts)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx/cmd/build.py", line 293, in build_main
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     app = Sphinx(args.sourcedir, args.confdir, args.outputdir,
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx/application.py", line 272, in __init__
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     self._init_builder()
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx/application.py", line 343, in _init_builder
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     self.events.emit('builder-inited')
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx/events.py", line 97, in emit
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     results.append(listener.handler(self.app, *args))
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx_gallery/gen_gallery.py", line 757, in generate_gallery_rst
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     ) = generate_dir_rst(
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx_gallery/gen_rst.py", line 606, in generate_dir_rst
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     results = parallel(
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx_gallery/gen_rst.py", line 607, in <genexpr>
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     p_fun(fname, target_dir, src_dir, gallery_conf) for fname in iterator
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/workspace/tutorials-kr/conf.py", line 86, in wrapper
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     p.start()
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/multiprocessing/process.py", line 121, in start
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     self._popen = self._Popen(self)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/multiprocessing/context.py", line 224, in _Popen
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     return _default_context.get_context().Process._Popen(process_obj)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/multiprocessing/context.py", line 281, in _Popen
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     return Popen(process_obj)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/multiprocessing/popen_fork.py", line 19, in __init__
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     self._launch(process_obj)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/multiprocessing/popen_fork.py", line 71, in _launch
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     code = process_obj._bootstrap(parent_sentinel=child_r)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     self.run()
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/multiprocessing/process.py", line 108, in run
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     self._target(*self._args, **self._kwargs)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/workspace/tutorials-kr/conf.py", line 74, in call_fn
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     result = func(*args, **kwargs)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx_gallery/gen_rst.py", line 1374, in generate_file_rst
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     output_blocks, time_elapsed = execute_script(
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx_gallery/gen_rst.py", line 1192, in execute_script
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     execute_code_block(
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx_gallery/gen_rst.py", line 1048, in execute_code_block
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     is_last_expr, mem_max = _exec_and_get_memory(
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx_gallery/gen_rst.py", line 876, in _exec_and_get_memory
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     mem_max, _ = call_memory(
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx_gallery/gen_rst.py", line 1725, in _sg_call_memory_noop
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     return 0.0, func()
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/opt/conda/lib/python3.11/site-packages/sphinx_gallery/gen_rst.py", line 794, in __call__
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     exec(self.code, self.fake_main.__dict__)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]   File "/workspace/tutorials-kr/recipes_source/torch_logs.py", line 63, in <module>
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]     fn(*inputs)
V1004 00:38:40.534000 3764082 site-packages/torch/_dynamo/convert_frame.py:1055] [0/0]
I1004 00:38:40.536000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:3320] [0/0] Step 1: torchdynamo start tracing fn /workspace/tutorials-kr/recipes_source/torch_logs.py:44
I1004 00:38:40.537000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:3767] [0/0] create_env
V1004 00:38:40.540000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1237] [0/0] [__trace_source] TRACE starts_line /workspace/tutorials-kr/recipes_source/torch_logs.py:44 in fn ()
V1004 00:38:40.540000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1237] [0/0] [__trace_source]         @torch.compile()
V1004 00:38:40.541000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1260] [0/0] [__trace_bytecode] TRACE RESUME 0 []
V1004 00:38:40.541000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1237] [0/0] [__trace_source] TRACE starts_line /workspace/tutorials-kr/recipes_source/torch_logs.py:46 in fn (fn)
V1004 00:38:40.541000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1237] [0/0] [__trace_source]             z = x + y
V1004 00:38:40.541000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1260] [0/0] [__trace_bytecode] TRACE LOAD_FAST x []
V1004 00:38:40.541000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1260] [0/0] [__trace_bytecode] TRACE LOAD_FAST y [LazyVariableTracker()]
V1004 00:38:40.542000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1260] [0/0] [__trace_bytecode] TRACE BINARY_OP 0 [LazyVariableTracker(), LazyVariableTracker()]
V1004 00:38:40.543000 3764082 site-packages/torch/_dynamo/variables/builder.py:3373] [0/0] wrap_to_fake L['x'] (2, 2) StatefulSymbolicContext(dynamic_sizes=[<DimDynamic.STATIC: 2>, <DimDynamic.STATIC: 2>], dynamic_strides=[<DimDynamic.INFER_STRIDE: 4>, <DimDynamic.INFER_STRIDE: 4>], constraint_sizes=[None, None], constraint_strides=[None, None], specialize_on=[[], []], view_base_context=None, tensor_source=LocalSource(local_name='x', is_input=True, dynamism=None, is_derefed_cell_contents=False), shape_env_to_source_to_symbol_cache={}) <class 'torch.Tensor'>
V1004 00:38:40.544000 3764082 site-packages/torch/_dynamo/output_graph.py:2614] [0/0] create_graph_input L_x_ L['x'] FakeTensor(..., device='cuda:0', size=(2, 2)) at debug_level 0 before=False
V1004 00:38:40.545000 3764082 site-packages/torch/_dynamo/variables/builder.py:3373] [0/0] wrap_to_fake L['y'] (2, 2) StatefulSymbolicContext(dynamic_sizes=[<DimDynamic.STATIC: 2>, <DimDynamic.STATIC: 2>], dynamic_strides=[<DimDynamic.INFER_STRIDE: 4>, <DimDynamic.INFER_STRIDE: 4>], constraint_sizes=[None, None], constraint_strides=[None, None], specialize_on=[[], []], view_base_context=None, tensor_source=LocalSource(local_name='y', is_input=True, dynamism=None, is_derefed_cell_contents=False), shape_env_to_source_to_symbol_cache={}) <class 'torch.Tensor'>
V1004 00:38:40.546000 3764082 site-packages/torch/_dynamo/output_graph.py:2614] [0/0] create_graph_input L_y_ L['y'] FakeTensor(..., device='cuda:0', size=(2, 2)) at debug_level 0 before=False
V1004 00:38:40.547000 3764082 site-packages/torch/_dynamo/output_graph.py:2462] [0/0] [__trace_call] TRACE FX call add from /workspace/tutorials-kr/recipes_source/torch_logs.py:46 in fn (fn)
V1004 00:38:40.547000 3764082 site-packages/torch/_dynamo/output_graph.py:2462] [0/0] [__trace_call]         z = x + y
V1004 00:38:40.547000 3764082 site-packages/torch/_dynamo/output_graph.py:2462] [0/0] [__trace_call]             ~~^~~
V1004 00:38:40.549000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1260] [0/0] [__trace_bytecode] TRACE STORE_FAST z [TensorVariable()]
V1004 00:38:40.549000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1237] [0/0] [__trace_source] TRACE starts_line /workspace/tutorials-kr/recipes_source/torch_logs.py:47 in fn (fn)
V1004 00:38:40.549000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1237] [0/0] [__trace_source]             return z + 2
V1004 00:38:40.549000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1260] [0/0] [__trace_bytecode] TRACE LOAD_FAST z []
V1004 00:38:40.549000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1260] [0/0] [__trace_bytecode] TRACE LOAD_CONST 2 [TensorVariable()]
V1004 00:38:40.549000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1260] [0/0] [__trace_bytecode] TRACE BINARY_OP 0 [TensorVariable(), ConstantVariable(int: 2)]
V1004 00:38:40.550000 3764082 site-packages/torch/_dynamo/output_graph.py:2462] [0/0] [__trace_call] TRACE FX call add_1 from /workspace/tutorials-kr/recipes_source/torch_logs.py:47 in fn (fn)
V1004 00:38:40.550000 3764082 site-packages/torch/_dynamo/output_graph.py:2462] [0/0] [__trace_call]         return z + 2
V1004 00:38:40.550000 3764082 site-packages/torch/_dynamo/output_graph.py:2462] [0/0] [__trace_call]                ~~^~~
V1004 00:38:40.551000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:1260] [0/0] [__trace_bytecode] TRACE RETURN_VALUE None [TensorVariable()]
I1004 00:38:40.551000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:3648] [0/0] Step 1: torchdynamo done tracing fn (RETURN_VALUE)
V1004 00:38:40.551000 3764082 site-packages/torch/_dynamo/symbolic_convert.py:3652] [0/0] RETURN_VALUE triggered compile
V1004 00:38:40.552000 3764082 site-packages/torch/_dynamo/output_graph.py:1263] [0/0] COMPILING GRAPH due to GraphCompileReason(reason='return_value', user_stack=[<FrameSummary file /workspace/tutorials-kr/recipes_source/torch_logs.py, line 47 in fn>], graph_break=False)
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code] TRACED GRAPH
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]  ===== __compiled_fn_1_2d30cbfe_4ba8_49a7_aef8_bc201073ee00 =====
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]  /opt/conda/lib/python3.11/site-packages/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]     def forward(self, L_x_: "f32[2, 2][2, 1]cuda:0", L_y_: "f32[2, 2][2, 1]cuda:0"):
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]         l_x_ = L_x_
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]         l_y_ = L_y_
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]          # File: /workspace/tutorials-kr/recipes_source/torch_logs.py:46 in fn, code: z = x + y
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]         z: "f32[2, 2][2, 1]cuda:0" = l_x_ + l_y_;  l_x_ = l_y_ = None
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]          # File: /workspace/tutorials-kr/recipes_source/torch_logs.py:47 in fn, code: return z + 2
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]         add_1: "f32[2, 2][2, 1]cuda:0" = z + 2;  z = None
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]         return (add_1,)
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]
V1004 00:38:40.554000 3764082 site-packages/torch/_dynamo/output_graph.py:1667] [0/0] [__graph_code]
I1004 00:38:40.555000 3764082 site-packages/torch/_dynamo/output_graph.py:1842] [0/0] Step 2: calling compiler function inductor
I1004 00:38:41.349000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5238] [0/0] produce_guards
I1004 00:38:41.351000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5238] [0/0] produce_guards
I1004 00:38:41.354000 3764082 site-packages/torch/_dynamo/output_graph.py:1847] [0/0] Step 2: done compiler function inductor
I1004 00:38:41.357000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5238] [0/0] produce_guards
V1004 00:38:41.357000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['x'].size()[0] 2 None
V1004 00:38:41.357000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['x'].size()[1] 2 None
V1004 00:38:41.357000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['x'].stride()[0] 2 None
V1004 00:38:41.358000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['x'].stride()[1] 1 None
V1004 00:38:41.358000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['x'].storage_offset() 0 None
V1004 00:38:41.358000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['y'].size()[0] 2 None
V1004 00:38:41.358000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['y'].size()[1] 2 None
V1004 00:38:41.358000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['y'].stride()[0] 2 None
V1004 00:38:41.358000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['y'].stride()[1] 1 None
V1004 00:38:41.358000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5458] [0/0] track_symint L['y'].storage_offset() 0 None
V1004 00:38:41.359000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['x'].size()[0] == 2
V1004 00:38:41.359000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['x'].size()[1] == 2
V1004 00:38:41.359000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['x'].stride()[0] == 2
V1004 00:38:41.359000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['x'].stride()[1] == 1
V1004 00:38:41.360000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['x'].storage_offset() == 0
V1004 00:38:41.360000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['y'].size()[0] == 2
V1004 00:38:41.360000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['y'].size()[1] == 2
V1004 00:38:41.360000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['y'].stride()[0] == 2
V1004 00:38:41.360000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['y'].stride()[1] == 1
V1004 00:38:41.360000 3764082 site-packages/torch/fx/experimental/symbolic_shapes.py:5679] [0/0] Skipping guard L['y'].storage_offset() == 0
V1004 00:38:41.360000 3764082 site-packages/torch/_dynamo/guards.py:3064] [0/0] [__guards] GUARDS:
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards]
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] TREE_GUARD_MANAGER:
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] +- RootGuardManager
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | +- LAMBDA_GUARD: torch._functorch.aot_autograd.utils.top_saved_tensors_hooks ids == None  # _dynamo/output_graph.py:633 in init_ambient_guards
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | +- DEFAULT_DEVICE: utils_device.CURRENT_DEVICE == None                           # _dynamo/output_graph.py:621 in init_ambient_guards
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | +- GLOBAL_STATE: ___check_global_state()
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | +- TORCH_FUNCTION_MODE_STACK: ___check_torch_function_mode_stack()
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | +- GuardManager: source=L['x'], accessed_by=FrameLocalsGuardAccessor(key='x', framelocals_idx=0)
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | | +- TENSOR_MATCH: check_tensor(L['x'], Tensor, DispatchKeySet(CUDA, BackendSelect, ADInplaceOrView, AutogradCUDA), torch.float32, device=0, requires_grad=False, size=[2, 2], stride=[2, 1])  # z = x + y  # orkspace/tutorials-kr/recipes_source/torch_logs.py:46 in fn
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | | +- NO_HASATTR: hasattr(L['x'], '_dynamo_dynamic_indices') == False           # z = x + y  # orkspace/tutorials-kr/recipes_source/torch_logs.py:46 in fn
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | | +- NO_TENSOR_ALIASING: check_no_aliasing(L['x'], L['y'])
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | +- GuardManager: source=L['y'], accessed_by=FrameLocalsGuardAccessor(key='y', framelocals_idx=1)
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | | +- TENSOR_MATCH: check_tensor(L['y'], Tensor, DispatchKeySet(CUDA, BackendSelect, ADInplaceOrView, AutogradCUDA), torch.float32, device=0, requires_grad=False, size=[2, 2], stride=[2, 1])  # z = x + y  # orkspace/tutorials-kr/recipes_source/torch_logs.py:46 in fn
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | | +- NO_HASATTR: hasattr(L['y'], '_dynamo_dynamic_indices') == False           # z = x + y  # orkspace/tutorials-kr/recipes_source/torch_logs.py:46 in fn
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards] | | +- NO_TENSOR_ALIASING
V1004 00:38:41.361000 3764082 site-packages/torch/_dynamo/guards.py:2863] [0/0] [__guards]
V1004 00:38:41.379000 3764082 site-packages/torch/_dynamo/guards.py:2894] [0/0] [__guards] Guard eval latency = 9.69 us
I1004 00:38:41.380000 3764082 site-packages/torch/_dynamo/pgo.py:785] [0/0] put_code_state: no cache key, skipping
I1004 00:38:41.380000 3764082 site-packages/torch/_dynamo/convert_frame.py:1175] [0/0] run_gc_after_compile: running gc
V1004 00:38:41.383000 3764082 site-packages/torch/_dynamo/convert_frame.py:1458] skipping: inner (reason: in skipfiles, file: /opt/conda/lib/python3.11/site-packages/torch/_compile.py)
V1004 00:38:41.384000 3764082 site-packages/torch/_dynamo/convert_frame.py:1458] skipping: disable (reason: in skipfiles, file: /opt/conda/lib/python3.11/site-packages/torch/_dynamo/decorators.py)
V1004 00:38:41.384000 3764082 site-packages/torch/_dynamo/convert_frame.py:1458] skipping: innermost_fn (reason: in skipfiles, file: /opt/conda/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py)
V1004 00:38:41.384000 3764082 site-packages/torch/_dynamo/convert_frame.py:1458] skipping: __init__ (reason: in skipfiles, file: /opt/conda/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py)
V1004 00:38:41.384000 3764082 site-packages/torch/_dynamo/convert_frame.py:1458] skipping: __init__ (reason: in skipfiles, file: /opt/conda/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py)
V1004 00:38:41.385000 3764082 site-packages/torch/_dynamo/convert_frame.py:1458] skipping: nothing (reason: in skipfiles, file: /opt/conda/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py)
V1004 00:38:41.385000 3764082 site-packages/torch/_dynamo/convert_frame.py:1458] skipping: __call__ (reason: in skipfiles, file: /opt/conda/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py)
V1004 00:38:41.385000 3764082 site-packages/torch/_dynamo/convert_frame.py:1458] skipping: _fn (reason: in skipfiles, file: /opt/conda/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py)
===================Traced Graph=========================
I1004 00:38:41.386000 3764082 site-packages/torch/_dynamo/__init__.py:118] torch._dynamo.reset
I1004 00:38:41.386000 3764082 site-packages/torch/_dynamo/__init__.py:151] torch._dynamo.reset_code_caches
===================Fusion Decisions=========================
===================Output Code=========================
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] Output code:
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] # AOT ID: ['0_inference']
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from ctypes import c_void_p, c_long, c_int
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] import torch
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] import math
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] import random
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] import os
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] import tempfile
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from math import inf, nan
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from cmath import nanj
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._inductor.hooks import run_intermediate_hooks
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._inductor.utils import maybe_profile
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._inductor.codegen.memory_planning import _align as align
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch import device, empty_strided
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._inductor.async_compile import AsyncCompile
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._inductor.select_algorithm import extern_kernels
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] import triton
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] import triton.language as tl
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._inductor.runtime.triton_heuristics import start_graph, end_graph
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._C import _cuda_getCurrentRawStream as get_raw_stream
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._C import _cuda_getCurrentRawStream as get_raw_stream
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] aten = torch.ops.aten
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] inductor_ops = torch.ops.inductor
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] _quantized = torch.ops._quantized
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] assert_size_stride = torch._C._dynamo.guards.assert_size_stride
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] assert_alignment = torch._C._dynamo.guards.assert_alignment
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] empty_strided_cpu = torch._C._dynamo.guards._empty_strided_cpu
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] empty_strided_cuda = torch._C._dynamo.guards._empty_strided_cuda
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] empty_strided_xpu = torch._C._dynamo.guards._empty_strided_xpu
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] reinterpret_tensor = torch._C._dynamo.guards._reinterpret_tensor
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] alloc_from_pool = torch.ops.inductor._alloc_from_pool
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] async_compile = AsyncCompile()
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] empty_strided_p2p = torch._C._distributed_c10d._SymmetricMemory.empty_strided_p2p
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] # kernel path: /tmp/torchinductor_root/bc/cbcra7zetpvpubjqpgzmr2fv6olpakamutgikukl3oc6f6jb3qbh.py
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] # Topologically Sorted Source Nodes: [z, add_1], Original ATen: [aten.add]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] # Source node to ATen node mapping:
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] #   add_1 => add_1
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] #   z => add
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] # Graph fragment:
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] #   %add : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%arg0_1, %arg1_1), kwargs = {})
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] #   %add_1 : [num_users=1] = call_function[target=torch.ops.aten.add.Tensor](args = (%add, 2), kwargs = {})
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] triton_poi_fused_add_0 = async_compile.triton('triton_poi_fused_add_0', '''
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] import triton
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] import triton.language as tl
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._inductor.runtime import triton_helpers, triton_heuristics
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._inductor.runtime.triton_helpers import libdevice, math as tl_math
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] from torch._inductor.runtime.hints import AutotuneHint, ReductionHint, TileHint, DeviceProperties
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] triton_helpers.set_driver_to_gpu()
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] @triton_heuristics.pointwise(
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     size_hints={'x': 4},
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     filename=__file__,
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     triton_meta={'signature': {'in_ptr0': '*fp32', 'in_ptr1': '*fp32', 'out_ptr0': '*fp32', 'xnumel': 'i32', 'XBLOCK': 'constexpr'}, 'device': DeviceProperties(type='cuda', index=0, multi_processor_count=132, cc=90, major=9, regs_per_multiprocessor=65536, max_threads_per_multi_processor=2048, warp_size=32), 'constants': {}, 'configs': [{(0,): [['tt.divisibility', 16]], (1,): [['tt.divisibility', 16]], (2,): [['tt.divisibility', 16]]}]},
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     inductor_meta={'grid_type': 'Grid1D', 'autotune_hints': set(), 'kernel_name': 'triton_poi_fused_add_0', 'mutated_arg_names': [], 'optimize_mem': True, 'no_x_dim': False, 'num_load': 2, 'num_reduction': 0, 'backend_hash': 'AD014388F727234BBC364D4F9312DA1C72DBEFEDA247CD785958FB6EB1138CAC', 'are_deterministic_algorithms_enabled': False, 'assert_indirect_indexing': True, 'autotune_local_cache': True, 'autotune_pointwise': True, 'autotune_remote_cache': None, 'force_disable_caches': False, 'dynamic_scale_rblock': True, 'max_autotune': False, 'max_autotune_pointwise': False, 'min_split_scan_rblock': 256, 'spill_threshold': 16, 'store_cubin': False, 'tiling_scores': {'x': 32}},
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     min_elem_per_thread=0
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] )
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] @triton.jit
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] def triton_poi_fused_add_0(in_ptr0, in_ptr1, out_ptr0, xnumel, XBLOCK : tl.constexpr):
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     xnumel = 4
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     xoffset = tl.program_id(0) * XBLOCK
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     xindex = xoffset + tl.arange(0, XBLOCK)[:]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     xmask = xindex < xnumel
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     x0 = xindex
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     tmp0 = tl.load(in_ptr0 + (x0), xmask)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     tmp1 = tl.load(in_ptr1 + (x0), xmask)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     tmp2 = tmp0 + tmp1
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     tmp3 = 2.0
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     tmp4 = tmp2 + tmp3
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     tl.store(out_ptr0 + (x0), tmp4, xmask)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] ''', device_str='cuda')
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] async_compile.wait(globals())
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] del async_compile
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] def call(args):
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     arg0_1, arg1_1 = args
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     args.clear()
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     assert_size_stride(arg0_1, (2, 2), (2, 1))
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     assert_size_stride(arg1_1, (2, 2), (2, 1))
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     with torch.cuda._DeviceGuard(0):
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]         torch.cuda.set_device(0)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]         buf0 = empty_strided_cuda((2, 2), (2, 1), torch.float32)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]         # Topologically Sorted Source Nodes: [z, add_1], Original ATen: [aten.add]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]         stream0 = get_raw_stream(0)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]         triton_poi_fused_add_0.run(arg0_1, arg1_1, buf0, 4, stream=stream0)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]         del arg0_1
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]         del arg1_1
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     return (buf0, )
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] def benchmark_compiled_module(times=10, repeat=10):
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     from torch._dynamo.testing import rand_strided
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     from torch._inductor.utils import print_performance
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     arg0_1 = rand_strided((2, 2), (2, 1), device='cuda:0', dtype=torch.float32)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     arg1_1 = rand_strided((2, 2), (2, 1), device='cuda:0', dtype=torch.float32)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     fn = lambda: call([arg0_1, arg1_1])
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     return print_performance(fn, times=times, repeat=repeat)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code] if __name__ == "__main__":
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     from torch._inductor.wrapper_benchmark import compiled_module_main
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]     compiled_module_main('None', benchmark_compiled_module)
V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]
V1004 00:38:41.483000 3764082 site-packages/torch/_inductor/codecache.py:1237] [0/0] [__output_code] Output code written to: /tmp/torchinductor_root/5h/c5hssiemdwuea5ppzag4e75dgp4oh5pub4qbdo4lnqgzy5jc55jn.py
============================================
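
The generated code above is hard to read through the per-line log prefix. The sketch below strips that prefix to recover the plain source text. It assumes the prefix shape seen in the lines above (level letter and date, time, process id, file and line, compile id, then `[__output_code]`); that shape is inferred from this output, not from a documented format, and `extract_output_code` is a hypothetical helper name, not part of PyTorch.

```python
import re

# Assumed prefix shape, inferred from the log lines shown above:
#   V1004 00:38:41.476000 3764082 path/file.py:1236] [0/0] [__output_code] <payload>
_PREFIX = re.compile(
    r"^[A-Z]\d{4} \d{2}:\d{2}:\d{2}\.\d+ \d+ \S+:\d+\] "
    r"\[[^\]]*\] \[__output_code\] ?(.*)$"
)


def extract_output_code(log_text: str) -> str:
    """Return the payload of each [__output_code] line with the prefix removed."""
    payloads = []
    for line in log_text.splitlines():
        match = _PREFIX.match(line)
        if match is None:
            continue  # not an output_code log line (e.g. a separator)
        payload = match.group(1)
        # The final status line is not part of the generated source.
        if payload.startswith("Output code written to:"):
            continue
        payloads.append(payload)
    return "\n".join(payloads)


head = "V1004 00:38:41.476000 3764082 site-packages/torch/_inductor/codecache.py:1236] [0/0] [__output_code]"
sample = "\n".join(
    [
        head + " def call(args):",
        head + "     arg0_1, arg1_1 = args",
        head,  # an empty payload line
        head + " Output code written to: /tmp/example.py",
        "============================================",
    ]
)

print(extract_output_code(sample))
```

Because the log also reports the path the code was written to, opening that file directly is usually simpler; a helper like this is mainly useful when only the captured log is available.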

Conclusion

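In this tutorial we introduced the TORCH_LOGS environment variable and the Python API by experimenting with a small number of the available logging options. To view descriptions of all available options, run any Python script that imports torch with TORCH_LOGS set to "help".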
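
As a minimal sketch of the "help" setting, the snippet below launches a child Python process that only imports torch, with TORCH_LOGS set to "help" in its environment. The helper name `torch_logs_help_command` is hypothetical. The child process may exit with a nonzero status after printing the option descriptions, so the sketch deliberately does not treat that as a failure.

```python
import os
import subprocess
import sys


def torch_logs_help_command():
    """Build the command and environment for listing TORCH_LOGS options."""
    # Copy the current environment and add TORCH_LOGS=help for the child only.
    env = dict(os.environ, TORCH_LOGS="help")
    # Importing torch is enough for the setting to be parsed.
    cmd = [sys.executable, "-c", "import torch"]
    return cmd, env


if __name__ == "__main__":
    cmd, env = torch_logs_help_command()
    # check=False: a nonzero exit status is not treated as an error here.
    subprocess.run(cmd, env=env, check=False)
```

From a shell, the equivalent is to prefix any command that imports torch with `TORCH_LOGS="help"`.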

Alternatively, the torch._logging documentation describes all available logging options.

For more information on torch.compile, see the torch.compile tutorial.

Total running time of the script: (0 minutes 7.062 seconds)
